1 Introduction


This report gives a fairly complete introduction to the Structured Singular Value ($\mu$) and
details some of the latest results. The $\mu$-based methods discussed here have proven to be
useful for analyzing the performance and robustness properties of linear feedback systems.
This report also describes the recent nonlinear extensions.

It is assumed that the reader is familiar with the general $\mu$ analysis framework. In this
context, analysis refers to the process of determining whether a system with a given controller
has  desired  characteristics,  whereas  synthesis  refers  to  the  process  of  finding  a  controller 
that  gives  desired  characteristics,  usually  expressed  in  terms  of  some  analysis  method. 
This  is  the  fairly  standard  usage  of  these  terms  in  the  control  community.  It  should  be 
obvious  that  the  question  of  analysis  must  be  settled  to  some  degree  before  a  reasonable 
synthesis  problem  can  be  posed.  The  formal  analysis  and  synthesis  techniques  discussed 
are  only  some  of  the  methods  that  might  make  up  the  overall  process  of  engineering  design. 

The  general  framework  to  be  used  is  illustrated  in  the  diagram  in  the  figure  below. 


Figure  1.1  General  Interconnection 

Any  linear  interconnection  of  inputs,  outputs,  commands,  perturbations,  and  a  controller 
can  be  rearranged  to  match  this  diagram.  For  the  purpose  of  analysis  the  controller  may 
be thought of as just another system component, and the diagram reduces to that below.


Figure  1.2  Perturbed  Disturbance-to-error 

The analysis problem involves determining whether the error $e$ remains in a desired set for
sets of inputs $d$ and perturbations $\Delta$. Analysis methods differ on the description of these
sets  and  the  assumptions  on  the  interconnection  structure  G.  For  now,  G  will  be  taken 
to be a linear, time-invariant, lumped system and be represented by a rational transfer
function. The convolution kernel associated with $G$ will be denoted as $g$, so $G$ is a
real-rational matrix function of a complex variable and $g$ is a matrix function of time. The
interconnection  structure  G  can  be  partitioned  so  that  the  transfer  function  from  d  to  e 
can  be  expressed  as  the  linear  fractional  transformation 

$$e = F_u(G, \Delta)\, d = \left[ G_{22} + G_{21} \Delta (I - G_{11} \Delta)^{-1} G_{12} \right] d.$$

The  external  input  d  is  an  additive  signal  entering  the  system  and  is  typically  used  to 
model  disturbances,  commands,  and  noise.  It  is  generally  inadequate  in  modeling  systems 
for  control  design  to  consider  uncertainty  only  in  the  form  of  uncertain  additive  signals. 
The  system  model  itself  typically  has  uncertainty  which  can  have  a  significant  impact 
on  system  performance.  This  uncertainty  is  a  consequence  of  unmodeled  dynamics  and 
parameter variations and is modeled as the perturbations $\Delta$ to the nominal interconnection
structure $G$. Note that the uncertainty modeled as $\Delta$ has a very different effect from that
of  d  on  the  performance  of  the  system.  For  example,  perturbations  can  cause  a  nominally 
stable  system  to  become  unstable,  which  d  cannot  do. 

At the heart of any theory about control are the assumptions made about $G$, $d$ and
$\Delta$, as well as the performance specifications on $e$. These assumptions determine the
analysis  methods  which  can  be  applied  to  obtain  conclusions  about  system  performance. 
A  desirable  objective  is  to  make  weak  assumptions  but  still  arrive  at  strong  conclusions 
and  the  inevitable  tradeoff  implied  by  this  objective  drives  the  development  of  new  theory. 
The control theoretician's role may be viewed as one of developing methods that allow
the  control  engineer  to  make  assumptions  which  seem  relatively  natural  and  physically 
motivated.  The  ultimate  question  of  the  applicability  of  any  mathematical  technique  to  a 
specific physical problem will always require a “leap of faith” on the part of the engineer,
and the theoretician can only hope to make this leap smaller.

It  is  beyond  the  scope  of  this  report  to  give  a  thorough  discussion  of  the  relationship 
between models and the physical systems they represent. Attention will be restricted to the
types of models that arise in the $\mu$ framework and have proven useful in applications. The
particular  focus  is  on  techniques  that  allow  very  precise  analysis  of  systems  which  have 
fairly  standard  performance  requirements  and  uncertainty  models  in  terms  of  additive 
noise  and  plant  perturbations.  While  the  “best”  assumptions  for  engineering  purposes 
will  always  be  a  matter  of  debate,  it  is  clear  that  for  any  given  set  of  assumptions  it 
is desirable to have very precise analysis techniques. The ideal would be necessary and
sufficient conditions for the satisfaction of a performance specification in the presence of
sets  of  inputs  and  perturbations.  Additionally,  the  conditions  should  be  computable  or 
should  at  least  yield  bounds  which  give  useful  estimates  of  system  performance.  With 
such  methods,  the  engineer  can  focus  directly  on  the  relationship  between  uncertainty 
assumptions  and  system  performance  without  worrying  about  potential  gaps  caused  by 
inadequate  analysis  techniques. 

The layout is as follows. Section 2 describes how parametric uncertainty in state space
models can be rearranged into the $\mu$ framework. Section 3 defines $\mu$ and its basic properties,
along with a few examples. Section 4 presents a well-known exact expression for $\mu$.
Section 5 describes some mathematical preliminaries that are used in subsequent sections
concerning the computable upper bound. Section 6 develops theory for the computation
of the upper bound, and relates the upper bound to $\mu$. Section 7 explores guaranteed
relationships between the upper bound and $\mu$ for various block structures. Section 8 is an
exposition of linear fractional transformations on structured uncertainties, and how both
$\mu$ and the upper bound can describe their behavior. Section 9 gives robustness tests for
a special class of uncertain difference equations. The extension of the $\mu$-based methods
to time-varying and nonlinear controllers is outlined here. Section 10 is a frequency
domain/small gain approach to the problem considered in section 9. Section 11 deals with
frequency domain $\mu$ tests. This material is standard, and is what is usually associated
with $\mu$. Section 12 presents counterexamples showing that the upper bound and $\mu$ are
different. Section 13 describes a power-like algorithm, reminiscent of power algorithms for
eigenvalues and singular values, that can be used to get lower bounds for $\mu$. Section 14
is an illustrative example, outlining the various analysis tests and possible conclusions.
Finally, Section 15 is the appendix.

2 Parametric Uncertainty in Components

One  natural  type  of  uncertainty  is  unknown  coefficients  in  a  state  space  model.  In  this 
section,  we  will  consider  a  special  class  of  state  space  models  with  unknown  coefficients, 
and  show  how  this  type  of  uncertainty  can  be  represented.  In  particular,  we  will  extract 
unknown  quantities  from  a  parametrically  uncertain  system  so  that  the  perturbations 
enter  the  system  in  a  feedback  form,  or,  using  the  term  we  will  later  introduce,  in  a  linear 
fractional  way.  This  type  of  modeling  will  form  the  basic  building  block  for  components 
with  parametric  uncertainty. 

After  setting  up  the  problem,  we  will  proceed  rather  informally,  manipulating  some  simple 
block  diagrams  to  arrive  at  the  special  representation  of  the  uncertainty.  These  types 
of manipulations are (either explicitly or implicitly) common to the rest of the report,
particularly section 8. There, while the proofs we give are precise, they tend to hide the
key  simple  idea  behind  each  particular  lemma.  It  is  useful  to  “draw”  the  block  diagrams 
pertinent  to  each  result,  as  this  makes  both  the  result  and  proof  clearer. 

Finally, we reformulate the robustness problem which arises when controlling such uncertain
plants into a linear algebra problem that, eventually, $\mu$ will solve. The material of
this section is motivated by the discussion in [MorM].

2.1  Problem  description 

We  begin  with  an  explanation  of  the  matrix  and  block  diagram  notation  that  we  will  use 
throughout. $\mathbf{C}^{n \times k}$ and $\mathbf{R}^{n \times k}$ are, respectively, all complex and real $n \times k$ matrices. Let
$M \in \mathbf{C}^{n \times k}$. As usual, $M^T$ denotes the transpose of $M$, and $M^*$ denotes the complex
conjugate transpose. Suppose $u$ and $v$ are complex vectors, with $u \in \mathbf{C}^{k}$, $v \in \mathbf{C}^{n}$, and
$v = Mu$. Pictorially, we will draw this relationship as




Figure 2.1 Pictorial Notation for Matrix-vector Multiplication

Next, suppose $M \in \mathbf{C}^{(n_1+n_2) \times (k_1+k_2)}$, and we partition in the obvious way as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$$

with $M_{ij} \in \mathbf{C}^{n_i \times k_j}$. Now if for $i = 1, 2$ we have $u_i \in \mathbf{C}^{k_i}$ and $v_i \in \mathbf{C}^{n_i}$, and furthermore

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = M \begin{bmatrix} u_1 \\ u_2 \end{bmatrix},$$

then we draw this as in Figure 2.2.

Figure 2.2 Pictorial Notation for Partitioned Matrix-vector Multiplication

When we need the norms of vectors in $\mathbf{C}^{n}$ or $\mathbf{R}^{n}$, unless otherwise stated, $\| \cdot \|$ will represent
the usual euclidean norm. That is, for $v \in \mathbf{C}^{n}$, with components $v_i \in \mathbf{C}$,

$$\|v\|^2 := \sum_{i=1}^{n} |v_i|^2.$$

Also,  consider  a  generic  finite  dimensional,  time  invariant,  linear  system,  described  by 

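$$\begin{aligned} \dot{x} &= Ax + Bu \\ y &= Cx + Du. \end{aligned}$$

Note that at every instant in time, $x$, $y$, $u$, and $\dot{x}$ are related by the simple matrix-vector
multiplication

$$\begin{bmatrix} \dot{x} \\ y \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix}$$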

which  in  our  notation  is  drawn  as 



Figure  2.3  Pictorial  Notation  for  Time  Invariant,  Linear  System 

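Now, onto the problem. Consider an $n$ dimensional, linear system $G_\delta$, parametrized by $k$
uncertain parameters, $\delta_1, \ldots, \delta_k$, and described by the following uncertain equations

$$\begin{aligned}
\dot{x} &= \Big( A + \sum_{i=1}^{k} \delta_i A_i \Big) x + \Big( B + \sum_{i=1}^{k} \delta_i B_i \Big) u \\
y &= \Big( C + \sum_{i=1}^{k} \delta_i C_i \Big) x + \Big( D + \sum_{i=1}^{k} \delta_i D_i \Big) u.
\end{aligned} \tag{2.1}$$

Here $A, A_i \in \mathbf{R}^{n \times n}$, $B, B_i \in \mathbf{R}^{n \times n_u}$, $C, C_i \in \mathbf{R}^{n_y \times n}$, and $D, D_i \in \mathbf{R}^{n_y \times n_u}$.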

The  various  terms  in  these  state  equations  are  interpreted  as  follows: 

•  The  nominal  system  description,  given  by  known  matrices  A,  B,C,  and  D. 

•  Parametric  uncertainty  in  the  nominal  description. 

1. All of the uncertainty in the model is contained in the $k$ scalar parameters
$\delta_1, \ldots, \delta_k$. Various assumptions on these parameters are possible. For the
purposes of this example, we will assume only two things: for each $i$, $\delta_i \in [-1, 1]$,
and the parameters do not vary with time; they are fixed (though in each instance that
the system is operated, the parameters may assume different values, so long as
they are in the interval $[-1, 1]$).

2. The structural knowledge about the uncertainty is contained in the matrices
$A_i$, $B_i$, $C_i$, and $D_i$. These reflect how the $i$'th uncertainty, $\delta_i$, affects the state
space model. By scaling the entries in these 4 matrices, the relative effect that
$\delta_i$ has on coefficients is varied. Choosing these matrices is the engineer’s job,
and is based on her knowledge of the physics that have led to the state space
equations.

2.2  Linear  fractional  transformations 

Consider the “perturbed” $A$ matrix (or $B$ or $C$ or $D$). The $jl$ element of this matrix is
of the form $A_{jl} + \sum_{i=1}^{k} \delta_i (A_i)_{jl}$. Note that this is an affine function of the
uncertainty.

Can  this  model  be  expressed  in  the  following  form? 

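$$\begin{aligned}
\dot{x} &= Ax + Bu + B_2 u_2 \\
y &= Cx + Du + D_{12} u_2 \\
y_2 &= C_2 x + D_{21} u + D_{22} u_2 \\
u_2 &= \mathrm{diag}\,[\delta_1 I, \delta_2 I, \ldots, \delta_k I]\; y_2
\end{aligned} \tag{2.2}$$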

In other words, can we define some additional inputs, $u_2$, and outputs, $y_2$, so that all
the uncertainty in the equations (2.1) is represented as a nominal system, $G_{nom}$, with the
unknown parameters entering as the feedback gains that close the loop from the additional
outputs to the additional inputs? This is shown in the figure below.

Figure 2.4 Pictorial Notation for Uncertain System

Recall the diagram for the generic linear system. Our problem is then reduced to finding
a real matrix $M$ such that for every set of parameters $\delta_i$, the following picture is true.


Figure 2.5 Representation of M

In this case, $G_{nom}$ would just be


Finding such an $M$ is quite easy. Consider a matrix $M$ partitioned in a $2 \times 2$ fashion as
below left.

Figure 2.7 Pictorial Notation for a Linear Fractional Transformation

If we close the bottom loop of this with a matrix $\Delta$ (as above, right), then the matrix
relating $u_1$ to $v_1$ is

$$M_{11} + M_{12} \Delta (I - M_{22} \Delta)^{-1} M_{21}$$

assuming that the inverse exists of course. Since our parameters enter equation (2.1) only
affinely, we guess that our $M_{22}$ can be chosen to be zero.
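
The closed-loop formula above can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `lower_lft` and the arguments `n1`, `k1` (row and column sizes of $M_{11}$) are illustrative choices, not notation from this report.

```python
import numpy as np

def lower_lft(M, Delta, n1, k1):
    """Close the bottom loop of M with Delta.

    Returns M11 + M12 Delta (I - M22 Delta)^{-1} M21, where M11 is the
    leading n1-by-k1 block of M. Raises LinAlgError if I - M22 Delta is singular.
    """
    M11, M12 = M[:n1, :k1], M[:n1, k1:]
    M21, M22 = M[n1:, :k1], M[n1:, k1:]
    I = np.eye(M22.shape[0])
    # Solve (I - M22 Delta) X = M21 rather than forming the inverse explicitly.
    X = np.linalg.solve(I - M22 @ Delta, M21)
    return M11 + M12 @ Delta @ X

# Scalar example: M11 = 1, M12 = 2, M21 = 3, M22 = 0.5, Delta = 0.5.
M = np.array([[1.0, 2.0],
              [3.0, 0.5]])
Delta = np.array([[0.5]])
closed = lower_lft(M, Delta, n1=1, k1=1)

# With M22 = 0 the dependence on Delta is affine: M11 + M12 Delta M21.
M_affine = np.array([[1.0, 2.0],
                     [3.0, 0.0]])
closed_affine = lower_lft(M_affine, Delta, n1=1, k1=1)
```

When $M_{22} = 0$ the inverse disappears and the result depends affinely on $\Delta$, which is why a zero $M_{22}$ is the natural guess for the affine model (2.1).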

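Indeed, for each $i$, let $q_i$ denote the rank of the matrix

$$P_i := \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix} \in \mathbf{R}^{(n+n_y) \times (n+n_u)}. \tag{2.3}$$

Then $P_i$ can be written as

$$P_i = \begin{bmatrix} L_i \\ W_i \end{bmatrix} \begin{bmatrix} R_i \\ Z_i \end{bmatrix}^T \tag{2.4}$$

where $L_i \in \mathbf{R}^{n \times q_i}$, $W_i \in \mathbf{R}^{n_y \times q_i}$, $R_i \in \mathbf{R}^{n \times q_i}$, and $Z_i \in \mathbf{R}^{n_u \times q_i}$.
Hence, we have

$$\delta_i P_i = \begin{bmatrix} L_i \\ W_i \end{bmatrix} \left[ \delta_i I_{q_i} \right] \begin{bmatrix} R_i \\ Z_i \end{bmatrix}^T$$

and therefore “our” $M_{11} + M_{12} \Delta M_{21}$, which is

$$\begin{bmatrix} A + \sum_{i=1}^{k} \delta_i A_i & B + \sum_{i=1}^{k} \delta_i B_i \\ C + \sum_{i=1}^{k} \delta_i C_i & D + \sum_{i=1}^{k} \delta_i D_i \end{bmatrix},$$

in fact looks like

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} + \begin{bmatrix} L_1 & \cdots & L_k \\ W_1 & \cdots & W_k \end{bmatrix} \begin{bmatrix} \delta_1 I_{q_1} & & \\ & \ddots & \\ & & \delta_k I_{q_k} \end{bmatrix} \begin{bmatrix} R_1^T & Z_1^T \\ \vdots & \vdots \\ R_k^T & Z_k^T \end{bmatrix}.$$

Therefore, correct definitions for the matrices $B_2$, $C_2$, $D_{12}$, $D_{21}$, and $D_{22}$ are

$$B_2 = \begin{bmatrix} L_1 & \cdots & L_k \end{bmatrix}, \quad D_{12} = \begin{bmatrix} W_1 & \cdots & W_k \end{bmatrix}, \quad C_2 = \begin{bmatrix} R_1^T \\ \vdots \\ R_k^T \end{bmatrix}, \quad D_{21} = \begin{bmatrix} Z_1^T \\ \vdots \\ Z_k^T \end{bmatrix} \tag{2.5}$$

and $D_{22} = 0$.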

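The uncertainty is contained in the block diagonal matrix $\Delta$. We define the “block
structure” associated with this system as

$$\boldsymbol{\Delta} := \left\{ \mathrm{diag}\,[\delta_1 I_{q_1}, \ldots, \delta_k I_{q_k}] : \delta_i \in [-1, 1] \right\} \tag{2.6}$$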

Note  that  if  we  had  not  done  the  rank  reduction  (equations  2.3,  2.4,  and  2.5),  then  this 
structure  would,  in  general,  have  much  larger  dimensions. 
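
The rank reduction step can be sketched numerically. The example below is only an illustration under stated assumptions: it uses the singular value decomposition to obtain a rank factorization of each $P_i$ (any full-rank factorization would serve equally well), and the names `rank_factor`, `pull_out_parameters` and `structured_delta` are hypothetical helpers, not part of this report.

```python
import numpy as np

def rank_factor(P, tol=1e-10):
    """Factor a real matrix as P = left @ right.T with q = rank(P) columns."""
    U, s, Vh = np.linalg.svd(P)
    q = int(np.sum(s > tol))          # numerical rank
    root = np.sqrt(s[:q])
    left = U[:, :q] * root            # stacks L_i over W_i
    right = Vh[:q, :].T * root        # stacks R_i over Z_i
    return left, right

def pull_out_parameters(P_list):
    """Build M12, M21 and the block sizes q_i from the matrices P_i."""
    lefts, rights, sizes = [], [], []
    for P in P_list:
        left, right = rank_factor(P)
        lefts.append(left)
        rights.append(right.T)
        sizes.append(left.shape[1])
    return np.hstack(lefts), np.vstack(rights), sizes

def structured_delta(deltas, sizes):
    """diag[delta_1 I_{q_1}, ..., delta_k I_{q_k}] for real scalars delta_i."""
    return np.diag(np.repeat(np.asarray(deltas, dtype=float), sizes))

# Illustrative data: a 3-by-3 nominal matrix, one rank-1 and one rank-2 P_i.
P_nom = np.arange(9.0).reshape(3, 3)
P1 = np.outer([1.0, 0.0, 1.0], [0.0, 1.0, 0.0])
P2 = np.diag([1.0, 2.0, 0.0])
M12, M21, sizes = pull_out_parameters([P1, P2])

deltas = [0.3, -0.7]
direct = P_nom + deltas[0] * P1 + deltas[1] * P2
via_feedback = P_nom + M12 @ structured_delta(deltas, sizes) @ M21
```

Here `sizes` comes out as `[1, 2]`, so the uncertainty block is $3 \times 3$. Without the rank reduction, each parameter would be repeated as many times as $P_i$ has rows or columns.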

How would an uncertain parameter enter in a multi-rank way? Consider a system with
several different components, each of whose models is affected in a linear fractional way
by something external to the system. For instance, the force/torque producing
effectiveness of an airplane’s controllable surfaces (rudder, aileron, canard) is affected by
ambient dynamic pressure. Suppose that for each surface, the model of its effectiveness
has dynamic pressure entering in an affine, linear fashion. Then each surface has an
uncertainty associated with pressure. Since these different surfaces affect the airplane in
different manners, there is no way to isolate the effect of dynamic pressure as one scalar
$\delta_{DP}$. Several of these identical scalars are necessary, and together they form a repeated
scalar block.


Remarks: Recall that the uncertain parameters entered both the state equations and
output equations in an affine, linear fashion. There is a more general model of
uncertainty which also leads to the “feedback” representation found in equation (2.2).
Each entry in the state space matrices can be a fraction of affine multilinear
combinations of the uncertain parameters. For example, a particular
perturbed entry of one of the matrices may look like

$$\frac{f_{nom} + f_1 \delta_2 + f_2 \delta_3}{1 + h_1 \delta_2 + h_2 \delta_1 \delta_2 \delta_3} \tag{2.7}$$

where the $f$'s and $h$'s are known, and represent how the uncertainty affects the
matrix entry (our example in this section has all of the $h$'s equal to 0).

These models for uncertainty are called linear fractional, and will be explored
more in sections 8 and 9. Unfortunately, the added generality in (2.7) as compared
to (2.1) introduces some difficulties: the nice uncertainty rank reduction procedure
(equations 2.3 - 2.5) becomes quite difficult. In fact, it is equivalent to finding
minimal realizations of multidimensional (several independent variables) systems.
In some simple problems, it is easy to extract the minimal number of uncertainties
by inspection. More generally, it is possible that an uncertainty structure much
larger (parameters entering many times) than is really necessary is obtained. From
a computational viewpoint, this is undesirable.

Also  note  that  any  linear  connection  of  several  uncertain  components  (inputs  to 
separate  components  being  linear  combinations  of  outputs  of  separate  components) 
will  have  exactly  the  same  form:  all  of  the  parametric  uncertainty  can  be  isolated  in 
a  block  diagonal  “feedback”  around  a  known,  fixed  system. 


Now, to motivate $\mu$ and the theorems in section 8, suppose we are given an uncertain
plant in the form (2.2), and a linear, time invariant, finite dimensional (LTIFD) controller
that stabilizes (feeding back $y$ to $u$) the nominal plant. Under what conditions does it
stabilize all of the perturbed plants? First, let the stabilizing controller be governed by
$\dot{\zeta} = A_c \zeta + B_c y$; $u = C_c \zeta$. We have chosen it strictly proper just to simplify some of the
equations (all of the robustness questions can be addressed for controllers with $D$ terms).
Define the following matrices


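$$M_{11} := \begin{bmatrix} A & B C_c \\ B_c C & A_c + B_c D C_c \end{bmatrix}, \qquad M_{12} := \begin{bmatrix} B_2 \\ B_c D_{12} \end{bmatrix} \tag{2.8}$$

$$M_{21} := \begin{bmatrix} C_2 & D_{21} C_c \end{bmatrix}, \qquad M_{22} := D_{22} \tag{2.9}$$

With $\eta := \begin{bmatrix} x \\ \zeta \end{bmatrix}$, it is straightforward to check that the perturbed closed loop system is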


Figure 2.8 Pictorial Notation for Perturbed Closed Loop System

Hence, to guarantee robust stability, we need to verify that for all $\Delta \in \boldsymbol{\Delta}$ (recall $\boldsymbol{\Delta}$ is the
appropriate uncertainty structure, equation 2.6), the eigenvalues of the matrix

$$M_{11} + M_{12} \Delta (I - M_{22} \Delta)^{-1} M_{21} \tag{2.10}$$

are in the open left half plane. Alternatively, if the problem had been formulated in discrete
time, then the condition would involve making sure the eigenvalues remained inside the
unit disc. Actually, this type of test is more directly handled by $\mu$. The $\mu$ test (Theorems
9.1 and 9.7) is applied to the whole matrix $M$, and involves not only the structure $\boldsymbol{\Delta}$
representing the uncertainty, but an augmented structure which makes sure that the
test checks the largest eigenvalue of $M_{11} + M_{12} \Delta (I - M_{22} \Delta)^{-1} M_{21}$, and not a different
quantity, such as the maximum singular value of this perturbed matrix. This is made
clearer in sections 8 and 9. As we have mentioned though, computation of $\mu$ is difficult,
and that is the real issue in using any of the results.
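
As a rough illustration of the robust stability question, the sketch below assembles the matrices of (2.8) and (2.9) and evaluates the eigenvalues of (2.10) on a grid of real parameter values. This is a brute force check and not the $\mu$ test: a grid can only expose instability, it cannot certify stability between grid points. The scalar plant, the controller gains, and the helper names are all made up for this example.

```python
import itertools
import numpy as np

def closed_loop_blocks(A, B, C, D, B2, C2, D12, D21, D22, Ac, Bc, Cc):
    """Assemble M11, M12, M21, M22 as in (2.8) and (2.9)."""
    M11 = np.block([[A, B @ Cc], [Bc @ C, Ac + Bc @ D @ Cc]])
    M12 = np.vstack([B2, Bc @ D12])
    M21 = np.hstack([C2, D21 @ Cc])
    return M11, M12, M21, D22

def stable_on_grid(M11, M12, M21, M22, sizes, grid):
    """True if (2.10) has all eigenvalues in the open left half plane
    for every combination of grid values of the real parameters."""
    I = np.eye(M22.shape[0])
    for deltas in itertools.product(grid, repeat=len(sizes)):
        Delta = np.diag(np.repeat(np.asarray(deltas, dtype=float), sizes))
        A_cl = M11 + M12 @ Delta @ np.linalg.solve(I - M22 @ Delta, M21)
        if np.max(np.linalg.eigvals(A_cl).real) >= 0.0:
            return False
    return True

def example(a1):
    """Plant xdot = (1 + a1*delta) x + u, y = x, with a fixed controller."""
    one, zero = np.array([[1.0]]), np.array([[0.0]])
    blocks = closed_loop_blocks(
        A=one, B=one, C=one, D=zero,
        B2=np.array([[a1]]), C2=one, D12=zero, D21=zero, D22=zero,
        Ac=np.array([[-10.0]]), Bc=one, Cc=np.array([[-30.0]]))
    return stable_on_grid(*blocks, sizes=[1], grid=np.linspace(-1.0, 1.0, 21))

small_uncertainty_ok = example(0.5)   # pole of the plant stays in [0.5, 1.5]
large_uncertainty_ok = example(2.5)   # pole of the plant can reach 3.5
```

For this example the closed loop matrix is $\begin{bmatrix} 1 + a_1\delta & -30 \\ 1 & -10 \end{bmatrix}$, which is stable exactly when $1 + a_1 \delta < 3$, so the first case passes and the second fails.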


2.3  Real  vs.  complex  perturbations 


All the theory presented here is appropriate for robustness analysis with complex
perturbations, and not for real perturbations (as in the example in this section). Hence,
the typical assumption we will impose on the $\delta_i$ in $\boldsymbol{\Delta}$ in (2.6) is actually $\delta_i \in \mathbf{C}$, $|\delta_i| \le 1$
for each $i$. That is, instead of viewing them as fixed unknown real parameters, they are
treated as fixed unknown complex parameters. As we will see in section 11, this is also
equivalent to treating them as stable, finite dimensional, linear time invariant systems,
with $|\delta_i(j\omega)| \le 1$ for all $\omega \in \mathbf{R}$. Therefore, if a particular problem has uncertainty that
is definitely real and not dynamical (i.e., complex), the methods here will be conservative,
since the smallest offending (destabilizing) perturbation will almost always be complex.

It  is  often  very  natural  to  model  uncertainty  with  real  perturbations,  when,  as  in  this 
section,  the  real  coefficients  of  a  differential  equation  model  are  uncertain.  It  is  important, 
however,  to  remember  that  such  parametric  variations  are  in  a  model,  not  in  the  physical 
system being modeled. Models with real parametric uncertainty are used because, in
principle, they allow more accurate representation of some systems. Complex perturbations
are typically used to represent uncertainty due to unmodeled dynamics, or to “cover” the
variations produced by several real parameters. In the $\mu$ framework, complex uncertain
blocks also arise for problems of robust performance.

Although computation of $\mu$ for complex perturbations is nontrivial and there are important
outstanding issues to be resolved, as indicated in this report, substantial progress has been
made and $\mu$ is being applied routinely to large engineering problems. Computation of $\mu$
for real perturbations is fundamentally more difficult than for complex perturbations.

The major issues in computing $\mu$, or its equivalent, are the generality of the problem
description,  the  exactness  of  analysis,  and  the  ease  of  computation.  With  existing  methods 
for  real  perturbations,  you  get  to  choose  two.  A  general  and,  in  principle,  exact  method  is 
a  brute  force  global  search  using  a  grid  of  parameter  values  (e.g.  Horowitz,  Ackermann). 
This inevitably involves an exponential growth in computation as a function of the number
of parameters, and taking fewer grid points to avoid this gives up exactness. Progress
is  being  made  in  reducing  the  computational  burden  of  exact  methods  ([deGS],  [SidG], 
[SidP]),  but  nothing  suggestive  of  polynomial-time  algorithms  is  available. 

An  approach  to  obtaining  exact  results  with  modest  computation  is  to  restrict  the  problem 
description.  The  best  example  is  Kharitonov’s  celebrated  result  for  polynomials  with 
coefficients  in  intervals.  Unfortunately,  it  is  almost  impossible  to  find  models  with  any 
engineering  motivation  that  fit  the  allowable  problem  description.  Again,  progress  is  being 
made  in  this  direction  by  allowing  more  general  uncertainty  descriptions  at  the  expense  of 
more  computation. 

The approach taken in [FanTD] could be characterized as being very general and
computationally attractive, but potentially inexact. Following the methods developed for complex
$\mu$, the main idea is to get upper and lower bounds using local search methods which are
computationally cheap, but may fail to find global solutions. One then seeks to prove that
the local methods yield global solutions, or that the bounds one gets are tight enough to
be of value in problems of interest. This strategy has been very successful for complex $\mu$
and appears to have promise for the real case as well, although it is clear that the real case
is much more challenging.


3  Structured  Singular  Value 


3.1  Definitions 

This section is devoted to defining the structured singular value, a matrix function denoted
by $\mu(\cdot)$. We consider matrices $M \in \mathbf{C}^{n \times n}$. In the definition of $\mu(M)$, there is an underlying
structure $\boldsymbol{\Delta}$ (a prescribed set of block diagonal matrices) on which everything in the
sequel depends. For each problem, this structure is in general different; it depends on
the uncertainty and performance objectives of the problem. Defining the structure
involves specifying three things: the type of each block, the total number of blocks, and
their dimensions.

There are two types of blocks: repeated scalar and full blocks. Two nonnegative integers, $s$
and $f$, represent the number of repeated scalar blocks and the number of full blocks,
respectively. To bookkeep their dimensions, we introduce positive integers $r_1, \ldots, r_s$; $m_1, \ldots, m_f$.
The $i$'th repeated scalar block is $r_i \times r_i$, while the $j$'th full block is $m_j \times m_j$. With those
integers given, we define $\boldsymbol{\Delta}$ as

$$\boldsymbol{\Delta} = \left\{ \mathrm{diag}\,[\delta_1 I_{r_1}, \ldots, \delta_s I_{r_s}, \Delta_1, \ldots, \Delta_f] : \delta_i \in \mathbf{C}, \ \Delta_j \in \mathbf{C}^{m_j \times m_j} \right\} \subset \mathbf{C}^{n \times n} \tag{3.1}$$

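For consistency among all the dimensions, we must have

$$\sum_{i=1}^{s} r_i + \sum_{j=1}^{f} m_j = n.$$

Often, we will need norm bounded subsets of $\boldsymbol{\Delta}$, and we introduce the following notation

$$\mathbf{B}\boldsymbol{\Delta} = \left\{ \Delta \in \boldsymbol{\Delta} : \bar{\sigma}(\Delta) \le 1 \right\} \tag{3.2}$$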

Note that in (3.1) we have put all the repeated scalar blocks first. This is just to keep the
notation as simple as possible; in fact they can come in any order. In any case, we will
see that every problem can always be set up (by rearranging rows and columns of $M$) so
that they appear first, so we are not losing any generality in this formulation. Also, the full
blocks do not have to be square, but restricting them as such saves a great deal in terms
of notation. This restriction is without loss of generality, since $\mu$ for nonsquare blocks can
be converted to $\mu$ for square blocks by adding rows and/or columns of zeros to $M$.
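
To make the bookkeeping concrete, the following sketch draws a random element of the structure (3.1) from the lists of block sizes. The function name `random_delta` and the use of Gaussian entries are arbitrary choices made for illustration only.

```python
import numpy as np

def random_delta(r, m, rng):
    """Return one element of the structure (3.1).

    r : sizes r_1, ..., r_s of the repeated scalar blocks
    m : sizes m_1, ..., m_f of the full (square) blocks
    """
    n = sum(r) + sum(m)               # consistency condition on the dimensions
    Delta = np.zeros((n, n), dtype=complex)
    pos = 0
    for ri in r:                      # repeated scalar blocks delta_i * I
        delta = rng.standard_normal() + 1j * rng.standard_normal()
        Delta[pos:pos + ri, pos:pos + ri] = delta * np.eye(ri)
        pos += ri
    for mj in m:                      # full complex blocks
        block = rng.standard_normal((mj, mj)) + 1j * rng.standard_normal((mj, mj))
        Delta[pos:pos + mj, pos:pos + mj] = block
        pos += mj
    return Delta

rng = np.random.default_rng(0)
Delta = random_delta(r=[2, 1], m=[3], rng=rng)      # s = 2, f = 1, n = 6
# Dividing by the largest singular value places the sample on the boundary of B-Delta.
Delta_unit = Delta / np.linalg.norm(Delta, 2)
```

Scaling by the maximum singular value keeps the block structure intact, since the structure is closed under multiplication by scalars.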


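Definition 3.1 For $M \in \mathbf{C}^{n \times n}$ (same dimensions as the elements of $\boldsymbol{\Delta}$), $\mu_{\boldsymbol{\Delta}}(M)$ is defined as

$$\mu_{\boldsymbol{\Delta}}(M) := \frac{1}{\min \left\{ \bar{\sigma}(\Delta) : \Delta \in \boldsymbol{\Delta}, \ \det(I + M\Delta) = 0 \right\}} \tag{3.3}$$

unless no $\Delta \in \boldsymbol{\Delta}$ makes $I + M\Delta$ singular, in which case $\mu_{\boldsymbol{\Delta}}(M) := 0$.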


An  alternative  expression  follows  almost  immediately  from  the  definition. 


Lemma 3.2 $\mu_{\boldsymbol{\Delta}}(M) = \max_{\Delta \in \mathbf{B}\boldsymbol{\Delta}} \rho(M\Delta)$


In view of this lemma, continuity of the function $\mu : \mathbf{C}^{n \times n} \to \mathbf{R}$ is apparent. In general,
though, the function $\mu : \mathbf{C}^{n \times n} \to \mathbf{R}$ is not a norm, since it doesn’t satisfy the triangle
inequality. However, for any $\alpha \in \mathbf{C}$, $\mu(\alpha M) = |\alpha| \mu(M)$, so in some sense, it is related to
how “big” the matrix is.

We can easily calculate $\mu_{\boldsymbol{\Delta}}(M)$ when $\boldsymbol{\Delta}$ is one of two extreme sets.

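• If $\boldsymbol{\Delta} = \{\delta I : \delta \in \mathbf{C}\}$ ($s = 1$, $f = 0$, $r_1 = n$), then $\mu_{\boldsymbol{\Delta}}(M) = \rho(M)$, the spectral radius
of $M$.

Proof: The only $\Delta$'s in $\boldsymbol{\Delta}$ which satisfy the $\det(I + M\Delta) = 0$ constraint are
negative reciprocals of nonzero eigenvalues of $M$ (times the identity). The smallest one of
these is associated with the largest (in magnitude) eigenvalue, so $\mu_{\boldsymbol{\Delta}}(M) = \rho(M)$. #

• If $\boldsymbol{\Delta} = \mathbf{C}^{n \times n}$ ($s = 0$, $f = 1$, $m_1 = n$), then $\mu_{\boldsymbol{\Delta}}(M) = \bar{\sigma}(M)$.

Proof: If $\bar{\sigma}(\Delta) < 1/\bar{\sigma}(M)$, then $\bar{\sigma}(M\Delta) < 1$, so $I + M\Delta$ is nonsingular. Applying
equation (3.3) implies $\mu_{\boldsymbol{\Delta}}(M) \le \bar{\sigma}(M)$. On the other hand, let $u$ and $v$
be unit vectors satisfying $Mv = \bar{\sigma}(M) u$, and define $\Delta := -\frac{1}{\bar{\sigma}(M)} v u^*$. Then
$\bar{\sigma}(\Delta) = 1/\bar{\sigma}(M)$ and $I + M\Delta$ is obviously singular. Hence, $\mu_{\boldsymbol{\Delta}}(M) \ge \bar{\sigma}(M)$. #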

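Obviously, for a general $\boldsymbol{\Delta}$ as in (3.1) we must have

$$\{\delta I : \delta \in \mathbf{C}\} \subset \boldsymbol{\Delta} \subset \mathbf{C}^{n \times n}. \tag{3.4}$$

Hence directly from the “minimization” in the definition of $\mu$, and the two simple cases
above, we can conclude that

$$\rho(M) \le \mu_{\boldsymbol{\Delta}}(M) \le \bar{\sigma}(M) \tag{3.5}$$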
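
The two extreme cases can be reproduced numerically. The sketch below constructs, for a random complex matrix, the smallest singularizing perturbation in each extreme structure, following the two proofs above. The function names are illustrative, and singularity of $I + M\Delta$ is checked through its smallest singular value rather than its determinant.

```python
import numpy as np

def extreme_case_perturbations(M):
    """Return rho(M), sigma_bar(M) and the minimizing Delta for each extreme set."""
    n = M.shape[0]
    eigvals = np.linalg.eigvals(M)
    lam = eigvals[np.argmax(np.abs(eigvals))]      # eigenvalue of largest magnitude
    delta_scalar = -(1.0 / lam) * np.eye(n)        # element of {delta * I}

    U, s, Vh = np.linalg.svd(M)
    u, v = U[:, 0], Vh[0, :].conj()                # M v = sigma_bar(M) u
    delta_full = -(1.0 / s[0]) * np.outer(v, u.conj())   # -(1/sigma_bar) v u^*
    return abs(lam), s[0], delta_scalar, delta_full

def smallest_singular_value(X):
    return np.linalg.svd(X, compute_uv=False)[-1]

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
rho, sigma_bar, d_scalar, d_full = extreme_case_perturbations(M)

gap_scalar = smallest_singular_value(np.eye(4) + M @ d_scalar)   # close to zero
gap_full = smallest_singular_value(np.eye(4) + M @ d_full)       # close to zero
```

The norms of the two perturbations are $1/\rho(M)$ and $1/\bar{\sigma}(M)$ respectively, which is consistent with the ordering in (3.5).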

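These bounds alone are not sufficient for our purposes, because the gap between $\rho$ and
$\bar{\sigma}$ can be arbitrarily large. We refine them by considering transformations on $M$ that do
not affect $\mu_{\boldsymbol{\Delta}}(M)$, but do affect $\rho$ and $\bar{\sigma}$. To do this, define the following two subsets
of $\mathbf{C}^{n \times n}$

$$\mathcal{Q} = \left\{ Q \in \boldsymbol{\Delta} : Q^* Q = I_n \right\} \tag{3.6}$$

$$\mathcal{D} = \left\{ \mathrm{diag}\,[D_1, \ldots, D_s, d_1 I_{m_1}, \ldots, d_f I_{m_f}] : D_i \in \mathbf{C}^{r_i \times r_i} \text{ is invertible}, \ d_j \ne 0 \right\} \tag{3.7}$$

Note that for any $\Delta \in \boldsymbol{\Delta}$, $Q \in \mathcal{Q}$, and $D \in \mathcal{D}$,

$$Q^* \in \mathcal{Q}, \quad Q\Delta \in \boldsymbol{\Delta}, \quad \Delta Q \in \boldsymbol{\Delta}, \quad \bar{\sigma}(Q\Delta) = \bar{\sigma}(\Delta Q) = \bar{\sigma}(\Delta) \tag{3.8}$$

$$D\Delta = \Delta D \tag{3.9}$$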

Consequently,  we  have: 


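Theorem 3.3 For all $Q \in \mathcal{Q}$ and $D \in \mathcal{D}$

$$\mu_{\boldsymbol{\Delta}}(MQ) = \mu_{\boldsymbol{\Delta}}(QM) = \mu_{\boldsymbol{\Delta}}(M) = \mu_{\boldsymbol{\Delta}}(DMD^{-1}) \tag{3.10}$$

Proof: For all $D \in \mathcal{D}$ and $\Delta \in \boldsymbol{\Delta}$,

$$\det(I + M\Delta) = \det(I + MD^{-1}\Delta D) = \det(I + DMD^{-1}\Delta)$$

since $D$ commutes with $\Delta$. Therefore $\mu_{\boldsymbol{\Delta}}(M) = \mu_{\boldsymbol{\Delta}}(DMD^{-1})$. Also, for each
$Q \in \mathcal{Q}$, $\det(I + M\Delta) = 0$ if and only if $\det(I + MQQ^*\Delta) = 0$. Since $Q^*\Delta \in \boldsymbol{\Delta}$
and $\bar{\sigma}(Q^*\Delta) = \bar{\sigma}(\Delta)$, we get $\mu_{\boldsymbol{\Delta}}(MQ) = \mu_{\boldsymbol{\Delta}}(M)$ as desired. The argument for
$QM$ is the same. #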


Therefore, the bounds in (3.5) can be tightened to

$$\max_{Q \in \mathcal{Q}} \rho(QM) \le \mu_{\boldsymbol{\Delta}}(M) \le \inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1}) \tag{3.11}$$
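
The effect of the two transformations can be seen on a small example. The sketch below evaluates both sides of (3.11) by a plain grid search for the structure with two $1 \times 1$ blocks, $\boldsymbol{\Delta} = \{\mathrm{diag}[\delta_1, \delta_2]\}$. It is an illustration only: the grid search, the grid ranges, and the function names are choices made here, and they are not the algorithms developed later in the report. One phase of $Q$ and one entry of $D$ are fixed to 1, which loses nothing because $\rho$ is unchanged when $Q$ is multiplied by a unit-modulus scalar and $\bar{\sigma}(DMD^{-1})$ is unchanged when $D$ is scaled or multiplied by a diagonal unitary matrix.

```python
import numpy as np

def spectral_radius(X):
    return np.max(np.abs(np.linalg.eigvals(X)))

def bounds_two_scalar_blocks(M, n_grid=721):
    """Grid estimates of max rho(QM) and inf sigma_bar(D M D^{-1}) for a 2x2 M."""
    # Q = diag[exp(j*theta), 1]
    thetas = np.linspace(0.0, 2.0 * np.pi, n_grid)
    lower = max(spectral_radius(np.diag([np.exp(1j * t), 1.0]) @ M) for t in thetas)
    # D = diag[d, 1] with d > 0
    ds = np.logspace(-3.0, 3.0, n_grid)
    upper = min(np.linalg.norm(np.diag([d, 1.0]) @ M @ np.diag([1.0 / d, 1.0]), 2)
                for d in ds)
    return lower, upper

# rho(M) = 1 while sigma_bar(M) = 10, so the bounds in (3.5) are far apart.
M = np.array([[0.0, 10.0],
              [0.1, 0.0]])
lower, upper = bounds_two_scalar_blocks(M)
```

For this matrix both refined bounds come out at approximately 1, so the scaling $D = \mathrm{diag}[0.1, 1]$ closes the gap left by (3.5) entirely.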


An  important  question  is  “when  are  the  bounds  in  (3.11)  actually  equalities?”.  This 
question  is  a  nontrivial  one,  and  a  large  portion  of  this  report  is  devoted  to  answering  it. 
The  results  we  will  subsequently  show  are 


• The lower bound, $\max_{Q \in \mathcal{Q}} \rho(QM)$, is always equal to $\mu_{\boldsymbol{\Delta}}(M)$. Unfortunately, the
function $l(Q) := \rho(QM)$ has local maxima which are not global, and computing the
global maximum of such functions is, in general, impossible.

• In contrast to the local phenomena described above, the function $u(D) := \bar{\sigma}(DMD^{-1})$
does not have any local minima which are not global, so computing $\inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1})$
is a reasonable task. In general though, $\mu_{\boldsymbol{\Delta}}(M) \le \inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1})$, and the
inequality can be strict. For certain
block structures $\boldsymbol{\Delta}$, equality always holds. The general situation is summarized in
the table below.

Table: When is the upper bound, $\inf_{D \in \mathcal{D}} \bar{\sigma}(DMD^{-1})$, always equal to $\mu$?
Entries in their original order:

yes (easy)
no (Sec. 12.1)
yes (easy)
yes (Sec. 7.2)
yes (Sec. 7.1.1)
no (Sec. 12.2)
yes (Sec. 7.1.3)
no (Sec. 7.1.2)
no (Sec. 7.1.2)


The  section  number  in  each  box  indicates  where  the  detailed  analysis  can  be  found 
in  this  report. 

3.2 Simple results & special cases

In this section, we derive simple expressions and bounds for $\mu$ in a few special cases. We
begin with a class of matrices for which we can derive an easy, explicit expression for $\mu$.
This will be done directly from the definition, independent of the upper and lower bounds
just described.


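Theorem 3.4  Let $n_1$, $n_2$, $m_1$ and $m_2$ be positive integers, and consider matrices of the
form

$$M = \begin{bmatrix} 0 & M_{12} \\ M_{21} & 0 \end{bmatrix} \tag{3.12}$$

where $M_{12} \in \mathbb{C}^{n_1 \times m_2}$, $M_{21} \in \mathbb{C}^{n_2 \times m_1}$ and the zero entries are of the appropriate dimensions.
Consider a perturbation set $\mathbf{\Delta}$ of the form

$$\mathbf{\Delta} = \{ \mathrm{diag}[\Delta_1, \Delta_2] : \Delta_1 \in \mathbb{C}^{m_1 \times n_1},\ \Delta_2 \in \mathbb{C}^{m_2 \times n_2} \}$$

i.e. two full blocks. With respect to this structure,

$$\mu(M) = \sqrt{\bar\sigma(M_{12})\, \bar\sigma(M_{21})}.$$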

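Proof: Let $M$ be any matrix as in (3.12), and let $\Delta \in \mathbf{\Delta}$. It is straightforward to verify
that $\det(I + M\Delta) = \det(I - M_{21}\Delta_1 M_{12}\Delta_2)$. Denote $\sqrt{\bar\sigma(M_{12})\,\bar\sigma(M_{21})}$ by $\gamma$, and
suppose that $\Delta \in \mathbf{\Delta}$ is chosen with $\bar\sigma(\Delta) < \frac{1}{\gamma}$. Then $\bar\sigma(M_{21}\Delta_1 M_{12}\Delta_2) < 1$, which
means that $I - M_{21}\Delta_1 M_{12}\Delta_2$ is nonsingular, and hence $I + M\Delta$ is nonsingular. This
gives a lower bound on the "minimum" part of the definition of $\mu$, namely

$$\min_{\Delta \in \mathbf{\Delta}} \{ \bar\sigma(\Delta) : \det(I + M\Delta) = 0 \} \ge \frac{1}{\gamma}$$

Obviously then, from (3.3)

$$\mu(M) \le \gamma \tag{3.13}$$

Actually (3.13) is an equality; to see this let $u$, $\bar u$, $v$ and $\bar v$ be unit vectors of appropriate
dimension that satisfy

$$M_{12} v = \bar\sigma(M_{12})\, u, \qquad M_{21} \bar v = \bar\sigma(M_{21})\, \bar u$$

Define the dyads

$$\Delta_1 = \frac{1}{\gamma} \bar v u^*, \qquad \Delta_2 = \frac{1}{\gamma} v \bar u^*.$$

Let $\Delta = \mathrm{diag}[\Delta_1, \Delta_2]$. Obviously $\bar\sigma(\Delta) = \frac{1}{\gamma}$, and $(I - M_{21}\Delta_1 M_{12}\Delta_2)\, \bar u = 0$, hence
$I + M\Delta$ is singular, and therefore $\mu(M) \ge \gamma$. #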


The same result was proven in [NetU], using a main result of [Doy]. Here, we used only
the definition of $\mu$ and simple linear algebra.
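The dyad construction in the proof of Theorem 3.4 can be checked numerically. The sketch below is an illustration under assumed dimensions and random data (the sizes and the seed are hypothetical); it builds $\Delta_1$, $\Delta_2$ from singular vectors and confirms that $I + M\Delta$ is numerically singular with $\bar\sigma(\Delta) = 1/\gamma$.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, m1, m2 = 2, 3, 4, 2          # hypothetical block dimensions

def crandn(rows, cols):
    """Random complex matrix, used only as test data."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))

M12, M21 = crandn(n1, m2), crandn(n2, m1)
M = np.block([[np.zeros((n1, m1)), M12],
              [M21, np.zeros((n2, m2))]])

# Top singular triples: M12 v = s12 u and M21 vbar = s21 ubar.
U12, s12, V12h = np.linalg.svd(M12)
U21, s21, V21h = np.linalg.svd(M21)
u, v = U12[:, 0], V12h[0].conj()
ubar, vbar = U21[:, 0], V21h[0].conj()

gamma = np.sqrt(s12[0] * s21[0])

# Dyads from the proof: Delta1 = vbar u^* / gamma, Delta2 = v ubar^* / gamma.
Delta1 = np.outer(vbar, u.conj()) / gamma      # m1 x n1
Delta2 = np.outer(v, ubar.conj()) / gamma      # m2 x n2
Delta = np.block([[Delta1, np.zeros((m1, n2))],
                  [np.zeros((m2, n1)), Delta2]])

smallest_sv = np.linalg.svd(np.eye(n1 + n2) + M @ Delta, compute_uv=False)[-1]
delta_norm = np.linalg.norm(Delta, 2)
print(smallest_sv, delta_norm * gamma)
```

The first printed value should be at rounding level and the second close to one, which is the numerical counterpart of $\mu(M) \ge \gamma$.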


The next example gives an easy-to-compute upper bound for rank deficient matrices with
arbitrary block structures.


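Theorem 3.5  Suppose $M \in \mathbb{C}^{n \times n}$ has rank $r$, $r < n$. Then we can write $M = LR^*$,
where $L, R \in \mathbb{C}^{n \times r}$. Partition $L$ and $R$ compatibly with the block structure as

$$L = \begin{bmatrix} L_1 \\ \vdots \\ L_s \\ K_1 \\ \vdots \\ K_f \end{bmatrix}, \qquad R = \begin{bmatrix} R_1 \\ \vdots \\ R_s \\ S_1 \\ \vdots \\ S_f \end{bmatrix} \tag{3.14}$$

so that $L_i, R_i \in \mathbb{C}^{r_i \times r}$ and $K_j, S_j \in \mathbb{C}^{m_j \times r}$. Then

$$\mu(M) \le \sum_{i=1}^{s} \bar\sigma(R_i^* L_i) + \sum_{j=1}^{f} \bar\sigma(S_j)\, \bar\sigma(K_j).$$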


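Proof: For any $\Delta \in \mathbf{\Delta}$

$$\det(I + M\Delta) = \det(I + LR^*\Delta) = \det(I + R^*\Delta L) = \det\Big(I + \sum_{i=1}^{s} \delta_i R_i^* L_i + \sum_{j=1}^{f} S_j^* \Delta_j K_j\Big) \tag{3.15}$$

If, for some $\beta > 0$, we can show that $\Delta \in \mathbf{\Delta}$, $\bar\sigma(\Delta) < \frac{1}{\beta}$ implies that

$$\bar\sigma\Big(\sum_{i=1}^{s} \delta_i R_i^* L_i + \sum_{j=1}^{f} S_j^* \Delta_j K_j\Big) < 1$$

then for all those $\Delta$, $\det(I + M\Delta) \ne 0$ (by (3.15)) and hence $\mu \le \beta$.
It is easy to find such a $\beta$. Suppose that $\Delta \in \mathbf{\Delta}$ and

$$\bar\sigma(\Delta) < \frac{1}{\sum_{i=1}^{s} \bar\sigma(R_i^* L_i) + \sum_{j=1}^{f} \bar\sigma(S_j)\,\bar\sigma(K_j)}$$

Then

$$\bar\sigma\Big(\sum_{i=1}^{s} \delta_i R_i^* L_i + \sum_{j=1}^{f} S_j^* \Delta_j K_j\Big) \le \sum_{i=1}^{s} |\delta_i|\, \bar\sigma(R_i^* L_i) + \sum_{j=1}^{f} \bar\sigma(\Delta_j)\, \bar\sigma(S_j)\, \bar\sigma(K_j) < 1$$

Therefore $\mu(M) \le \sum_{i=1}^{s} \bar\sigma(R_i^* L_i) + \sum_{j=1}^{f} \bar\sigma(S_j)\, \bar\sigma(K_j)$. #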


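Theorem 3.6  Let $M \in \mathbb{C}^{n \times n}$ be given, and suppose that $M$ has rank equal to 1. Write
$M = LR^*$, and partition $L$ and $R$ compatibly with the block structure as

$$L = \begin{bmatrix} L_1 \\ \vdots \\ L_s \\ K_1 \\ \vdots \\ K_f \end{bmatrix}, \qquad R = \begin{bmatrix} R_1 \\ \vdots \\ R_s \\ S_1 \\ \vdots \\ S_f \end{bmatrix} \tag{3.16}$$

so that $L_i, R_i \in \mathbb{C}^{r_i \times 1}$ and $K_j, S_j \in \mathbb{C}^{m_j \times 1}$. Then

$$\mu(M) = \sum_{i=1}^{s} |R_i^* L_i| + \sum_{j=1}^{f} \|S_j\|\, \|K_j\| \tag{3.17}$$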


Proof: For notational simplicity, let $\gamma := \sum_{i=1}^{s} |R_i^* L_i| + \sum_{j=1}^{f} \|S_j\|\, \|K_j\|$. Obviously from
Theorem 3.5 we already have $\mu \le \gamma$. With $M$ a dyad, we will actually show that
it is an equality. For each $i \le s$, choose $q_i \in \mathbb{C}$, $|q_i| = 1$, so that $q_i R_i^* L_i$ is a real,
nonpositive number. Similarly, for each $j \le f$, choose a unitary matrix $Q_j$ so that
$S_j^* Q_j K_j = -\|S_j\|\, \|K_j\|$. These two steps can always be done. Suppose that $\gamma \ne 0$.
Then define

$$\Delta := \frac{1}{\gamma}\, \mathrm{diag}[q_1 I_{r_1}, \dots, q_s I_{r_s}, Q_1, \dots, Q_f] \in \mathbf{\Delta} \tag{3.18}$$

By construction, $\bar\sigma(\Delta) = \frac{1}{\gamma}$, and $I + M\Delta$ is singular, therefore $\mu_{\Delta}(M) \ge \gamma$, so using
Theorem 3.5, we get the equality as claimed. #
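The rank-one formula (3.17) and the perturbation (3.18) can be exercised numerically. The sketch below is a minimal check under assumptions: the block sizes and random data are hypothetical, and in place of the unitary $Q_j$ of the proof it uses a norm-one dyad, which gives the same product $S_j^* Q_j K_j = -\|S_j\|\,\|K_j\|$ and still lies in the full block with $\bar\sigma = 1$.

```python
import numpy as np

rng = np.random.default_rng(1)
r_sizes = [2]        # one repeated scalar block, delta_1 * I_2 (assumed sizes)
m_sizes = [2, 3]     # two full blocks (assumed sizes)
n = sum(r_sizes) + sum(m_sizes)

L = rng.standard_normal(n) + 1j * rng.standard_normal(n)
R = rng.standard_normal(n) + 1j * rng.standard_normal(n)
M = np.outer(L, R.conj())                      # rank one, M = L R^*

Delta = np.zeros((n, n), dtype=complex)
gamma = 0.0
pos = 0
for r in r_sizes:
    sl = slice(pos, pos + r)
    c = np.vdot(R[sl], L[sl])                  # R_i^* L_i
    q = -np.conj(c) / abs(c)                   # |q| = 1 and q * c = -|c|
    Delta[sl, sl] = q * np.eye(r)
    gamma += abs(c)
    pos += r
for m in m_sizes:
    sl = slice(pos, pos + m)
    S, K = R[sl], L[sl]
    ns, nk = np.linalg.norm(S), np.linalg.norm(K)
    # Norm-one dyad standing in for the unitary Q_j of the proof.
    Delta[sl, sl] = -np.outer(S / ns, (K / nk).conj())
    gamma += ns * nk
    pos += m
Delta /= gamma                                 # now sigma_max(Delta) = 1 / gamma

det_value = np.linalg.det(np.eye(n) + M @ Delta)
delta_norm = np.linalg.norm(Delta, 2)
print(abs(det_value), delta_norm * gamma)
```

A determinant at rounding level together with $\bar\sigma(\Delta)\gamma \approx 1$ is consistent with $\mu(M) \ge \gamma$ for this example.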


4  Proof that lower bound achieves $\mu$


Recall the two bounds we derived in section 3.1.

$$\max_{Q \in \mathcal{Q}} \rho(QM) \le \mu(M) \le \inf_{D \in \mathcal{D}} \bar\sigma(DMD^{-1})$$

A main result of [Doy] is that for any block structure $\mathbf{\Delta}$ as defined by (3.1), the left hand
side of the bound above is actually an equality:

Theorem 4.1  Let $\mathbf{\Delta}$ be a given block structure, and let the set $\mathcal{Q}$ be defined by (3.6).
Then for every matrix $M$ of appropriate dimensions,

$$\mu(M) = \max_{Q \in \mathcal{Q}} \rho(QM). \tag{4.1}$$

We  begin  by  stating  a  well  known  result  from  complex  analysis  called  Rouche’s  theorem 
[Rud]. 

Theorem 4.2  Let $\Gamma$ be a simple closed contour in the complex plane, $\mathbb{C}$. Let $f$ and $g$ be
functions which are analytic inside and on $\Gamma$. If $|g(z)| < |f(z)|$ on $\Gamma$, then $f$ and $f + g$
have the same number of zeros inside $\Gamma$.

This  is  used  in  proving  the  next  lemma,  which  is  the  well  known  result  stating  that  the 
roots  of  a  polynomial  are  continuous  functions  of  the  coefficients  of  the  polynomial. 

Lemma 4.3  Let $f(z) = \sum_{i=0}^{n} a_i z^i$ be an $n$'th order polynomial, $a_n \ne 0$. Let $z_1, z_2, \dots, z_n$
be the zeros of $f$. For any $\epsilon > 0$ and any integer $m \ge 0$, there exists a $\delta_{m,\epsilon} > 0$ such that
if $g(z)$, defined by

$$g(z) = \sum_{i=0}^{n+m} b_i z^i$$

has coefficients $b_i \in \mathbb{C}$ which satisfy $|b_i| < \delta_{m,\epsilon}$, then there are $n$ zeros of $f + g$, labeled
$\tilde z_1, \tilde z_2, \dots, \tilde z_n$, that satisfy $|\tilde z_i - z_i| < \epsilon$.


Hence the zeros of $f$ depend continuously on the coefficients of the polynomial (even
leading coefficients which are zero).
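A small numerical illustration of Lemma 4.3, assuming a specific quadratic and a specific perturbation size (both chosen here only for illustration): adding a small cubic term leaves two roots near the original ones and creates one additional root far from the origin.

```python
import numpy as np

# f(z) = z^2 - 3z + 2 has zeros 1 and 2 (example chosen for illustration).
f_roots = np.array([1.0, 2.0])

# g(z) = delta * (z^3 + z^2 + z + 1) perturbs every coefficient, including
# the "leading coefficient which is zero" (here m = 1).
delta = 1e-4
perturbed = np.roots([delta, 1.0 + delta, -3.0 + delta, 2.0 + delta])

# Distance from each zero of f to the nearest zero of f + g.
nearest = [float(np.min(np.abs(perturbed - z))) for z in f_roots]
largest = float(np.max(np.abs(perturbed)))
print(nearest, largest)
```

The extra root escapes toward infinity as `delta` shrinks, which is why the lemma speaks of $n$ zeros of $f + g$ rather than all of them.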

Next, we shift our attention to polynomials in several dimensions, that is, polynomials
taking $\mathbb{C}^k \to \mathbb{C}$. If $z \in \mathbb{C}^k$, we let $\|z\|_\infty := \max_{i \le k} |z_i|$. For $p: \mathbb{C}^k \to \mathbb{C}$, a polynomial, define
$\beta_p$ as

$$\beta_p := \min \{ \|z\|_\infty : p(z) = 0 \} \tag{4.2}$$

$\beta_p$ is the norm of the "smallest" zero of the polynomial. The next lemma is from [Doy].

Lemma 4.4  Let $p$ be a polynomial from $\mathbb{C}^k \to \mathbb{C}$. Define $\beta_p$ via (4.2). Then there exists
a $z \in \mathbb{C}^k$ such that $|z_i| = \beta_p$ for each $i$, and $p(z) = 0$.

Proof: Let $z$ be a minimizing solution, so $p(z) = 0$ and $\|z\|_\infty = \beta_p$. If $|z_i| = \beta_p$ for all $i$,
then we are done, so assume that $|z_r| < \beta_p$ for some $r$. Now we can (always) write

$$p(z) = \sum_{i=0}^{n} p_i(z_1, \dots, z_{r-1}, z_{r+1}, \dots, z_k)\, z_r^i \tag{4.3}$$

where the $p_i$ are polynomials in all the variables except $z_r$.

For notational purposes, we denote by $\hat p_i$ the polynomial $p_i$ evaluated at $z$ (of course,
it doesn't depend on $z_r$), that is

$$\hat p_i := p_i(z_1, \dots, z_{r-1}, z_{r+1}, \dots, z_k)$$

and we let $L$ denote the set of integers $\{1, 2, \dots, r-1, r+1, \dots, k\}$.

There are three situations we need to consider:

1.  Suppose that for every $i$, $\hat p_i = 0$. Then, regardless of the value of $z_r$, $p(z) = 0$.
In particular, the magnitude of $z_r$ may be adjusted to be $\beta_p$ and $z$ will still be
a root of $p$.

2.  Suppose that $\hat p_0 \ne 0$, but $\hat p_i = 0$ for $i \ge 1$. Quick checking reveals that this is
not possible, since then $p(z) = \hat p_0 \ne 0$, whereas we need $p(z) = 0$.

3.  Suppose that for some $i \ge 1$, $\hat p_i \ne 0$. Then $z_r$ is a zero of the nontrivial
polynomial $q(z_r) = \sum_{i=0}^{n} \hat p_i z_r^i$. Let $\epsilon > 0$ with $|z_r| + \epsilon < \beta_p$. By Lemma 4.3, we
can find a $\delta > 0$ such that if $|q_i - \hat p_i| < \delta$ for each $i$, then the polynomial
$\tilde q(z_r) := \sum_{i=0}^{n} q_i z_r^i$ would have a zero $\tilde z_r$ satisfying $|\tilde z_r - z_r| < \epsilon$. Since the $p_i$ are
continuous functions of their $k - 1$ arguments, we can find a $\tilde\delta > 0$ such that if
$|\zeta_i - z_i| < \tilde\delta$ for all $i \in L$, then there is a $\tilde z_r$ with $|\tilde z_r - z_r| < \epsilon$, such that

$$\sum_{i=0}^{n} p_i(\zeta_1, \dots, \zeta_{r-1}, \zeta_{r+1}, \dots, \zeta_k)\, \tilde z_r^i = 0$$

In particular, we could choose all the $\zeta_i$ to have smaller magnitude than the
respective $z_i$. Therefore the point

$$\begin{bmatrix} \zeta_1 & \cdots & \zeta_{r-1} & \tilde z_r & \zeta_{r+1} & \cdots & \zeta_k \end{bmatrix}^T \in \mathbb{C}^k$$

has infinity norm $< \beta_p$, but is a root of $p(z)$. This contradicts the definition of $\beta_p$,
hence this situation cannot occur.

Therefore, we have shown that if $z$ is a minimizing solution, i.e., $\|z\|_\infty = \beta_p$ where
$\beta_p = \min \{ \|z\|_\infty : p(z) = 0 \}$, then we may as well assume that each of the components
of $z$ has magnitude equal to $\beta_p$. #

This  is  the  lemma  necessary  to  prove  that  the  lower  bound  is  an  equality. 

Theorem 4.5  Let $\mathbf{\Delta}$ be a given block structure, and let $\mathcal{Q}$ be defined as in section 3.1.
Then for every matrix $M$ of appropriate dimensions,

$$\max_{Q \in \mathcal{Q}} \rho(QM) = \mu(M)$$

Proof: This is obvious if $\mu(M) = 0$, so we will assume that $\mu(M) > 0$. Let $\Delta \in \mathbf{\Delta}$ be a
minimizing solution, so $\det(I + M\Delta) = 0$, and $\bar\sigma(\Delta) = \frac{1}{\mu(M)}$. Do a singular value
decomposition on each block that makes up $\Delta$. This gives $U, V \in \mathcal{Q}$, and a diagonal
$\Sigma \in \mathbf{\Delta}$, such that

$$\det(I + MU\Sigma V^*) = 0$$

Since $\Sigma \in \mathbf{\Delta}$ and is diagonal, it appears as

$$\Sigma = \mathrm{diag}[\delta_1 I_{r_1}, \dots, \delta_s I_{r_s}, \alpha_1, \dots, \alpha_w]$$

for some complex numbers $\delta_i$ and $\alpha_j$, and $w = \sum_{j=1}^{f} m_j$ (recall the $j$'th full block is
$m_j \times m_j$, hence each full block contributes $m_j$ of the $\alpha$'s).

Consider $s + w$ complex variables, $z_1, \dots, z_{s+w}$. Define a variable $\Xi$ by

$$\Xi = \mathrm{diag}[z_1 I_{r_1}, \dots, z_s I_{r_s}, z_{s+1}, \dots, z_{s+w}]$$

Then $\det(I + MU\Xi V^*)$ is a polynomial on $\mathbb{C}^{s+w}$, since the determinant involves
only multiplications and additions. By hypothesis, a minimum norm root of this
polynomial has an infinity norm (as defined above) of $\frac{1}{\mu(M)} =: \gamma$. By Lemma 4.4, let $\hat\Xi$ be the
minimizing root with all components of equal magnitude, namely $\gamma$. Then we can
write $\hat\Xi = \gamma\Phi$ for some $\Phi \in \mathcal{Q}$. This gives

$$\det(I + \gamma MU\Phi V^*) = 0.$$

Obviously $\rho(MU\Phi V^*) \ge \mu(M)$, and the product $U\Phi V^* \in \mathcal{Q}$, so we are done. #


5  Preliminaries  for  study  of  upper  bound 


The next major undertaking is a careful study of the upper bound: its computational
properties, and the relation between $\mu$ and the upper bound. The purpose of this section is to
collect some mathematical facts that we will need. All of the upcoming material appeared
first in [Doy], although the theorems for the upper bound there are less general. Here we
generalize the theorems in [Doy] to include block structures with repeated scalar blocks.
Initially, we will focus on the $\bar\sigma(DMD^{-1})$ upper bound and begin by reparametrizing it.


5.1  Reparametrization  of  the  upper  bound 

For the sake of computation, and proving some theorems, we must eliminate a degree of
freedom present in the $D$'s as they are defined now. From now on, we will assume that
there is always at least 1 full uncertainty block, so that $f \ge 1$. The case with $s \ge 2$ and
$f = 0$ is handled separately in section 12.1.

First, note that for any nonzero $\alpha \in \mathbb{C}$, and any $D \in \mathcal{D}$,

$$\bar\sigma(DMD^{-1}) = \bar\sigma\big((\alpha D) M (\alpha D)^{-1}\big). \tag{5.1}$$

Hence, in calculating the infimum, we can use this scaling, and without loss in generality,
always assume that $d_f = 1$. Since we will have occasion to use it again though, we will
now refer to the original set $\mathcal{D}$ as defined in (3.7) as $\mathcal{D}_g$.

In addition, we may assume that the other $d_j$ are positive, and the $D_i$ are positive definite.
To see this, take $D \in \mathcal{D}$ and do a polar decomposition, $D = UP$ with $U$ unitary and
$P = P^* > 0$. Obviously

$$\bar\sigma(DMD^{-1}) = \bar\sigma(UPMP^{-1}U^*) = \bar\sigma(PMP^{-1}) \tag{5.2}$$

by the unitary invariance of $\bar\sigma$. Hence for any $D \in \mathcal{D}$, there is a positive definite, hermitian
$D_p \in \mathcal{D}$ that achieves the same $\bar\sigma$. Therefore, the following definition for $\mathcal{D}_p$

$$\mathcal{D}_p = \{ \mathrm{diag}[D_1, \dots, D_s, d_1 I_{m_1}, \dots, d_{f-1} I_{m_{f-1}}, I_{m_f}] : D_i = D_i^* \in \mathbb{C}^{r_i \times r_i},\ D_i > 0,\ d_j > 0 \} \tag{5.3}$$

leaves the infimum the same. Note that implicitly, the last block has $d_f = 1$ as we
indicated above.

We do one further reparametrization via logarithms. Recall that

$$\{ e^W : W = W^* \in \mathbb{C}^{m \times m} \} = \{ D : D = D^* \in \mathbb{C}^{m \times m},\ \text{positive definite} \} \tag{5.4}$$




This simply says that the set of exponentials of all hermitian matrices is equal to the set
of positive definite, hermitian matrices. The obvious block diagonal version of this fact
allows us to redefine $\mathcal{D}$ as

$$\mathcal{D} := \{ \mathrm{diag}[D_1, \dots, D_s, d_1 I_{m_1}, \dots, d_{f-1} I_{m_{f-1}}, 0_{m_f}] : D_i = D_i^* \in \mathbb{C}^{r_i \times r_i},\ d_j \in \mathbb{R} \} \tag{5.5}$$

and the upper bound as

$$\mu_{\Delta}(M) \le \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D}). \tag{5.6}$$

We note that $\mathcal{D}$ is a finite dimensional, real (scalar multiplication must be real) vector
space.


5.2  Convexity  of  the  Upper  Bound 

In this section, we prove that the reparametrized upper bound is convex in the variable
$D$. Therefore, any local minimum is also a global minimum. Hence gradient optimization
methods, which can yield local minima, can be used to nonconservatively compute the
upper bound for $\mu$. The first proof of this can be found in [SafD]. Here, we take an
approach from [ChuD].

Definition 5.1  Let $X$ be a vector space. A function $f : X \to \mathbb{R}$ is convex if for every
$x, y \in X$, $\lambda \in [0, 1]$

$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$$

The  next  lemma  gives  a  sufficient  condition  for  a  continuous  function  to  be  convex.  It  is 
fairly  intuitive  and  is  taken  from  [ChuD].  The  proof  is  in  the  appendix. 

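Lemma 5.2  Let $f: \mathbb{R} \to \mathbb{R}$ be a continuous function, and suppose for each $t_0 \in \mathbb{R}$, there
exists a twice differentiable function $g_{t_0}: \mathbb{R} \to \mathbb{R}$, such that $f(t_0) = g_{t_0}(t_0)$, $f(t) \ge g_{t_0}(t)$ for
all $t \in \mathbb{R}$ and $\left.\frac{d^2 g_{t_0}}{dt^2}\right|_{t = t_0} \ge 0$. Then $f$ is a convex function.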

We  apply  this  to  our  situation. 

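Lemma 5.3  For every $D \in \mathcal{D}$, the function $f: \mathbb{R} \to \mathbb{R}$, $f(t) := \bar\sigma(e^{Dt} M e^{-Dt})$ is convex.

Proof: We just need to verify the hypothesis of the lemma. Let $t_0$ be given and let $u_{t_0}$
and $v_{t_0}$ be complex unit vectors, of appropriate dimensions such that

$$u_{t_0}^* e^{Dt_0} M e^{-Dt_0} v_{t_0} = \bar\sigma(e^{Dt_0} M e^{-Dt_0}).$$

For later use, we will let $\sigma_0$ denote $\bar\sigma(e^{Dt_0} M e^{-Dt_0})$ and $M_0 := e^{Dt_0} M e^{-Dt_0}$.

Define $g_{t_0}: \mathbb{R} \to \mathbb{R}$ by

$$g_{t_0}(t) = \mathrm{Re}\left[ u_{t_0}^* e^{Dt} M e^{-Dt} v_{t_0} \right] \tag{5.7}$$

Obviously, $f(t_0) = g_{t_0}(t_0)$, and for all $t \in \mathbb{R}$, $f(t) \ge g_{t_0}(t)$. Differentiating (5.7) twice

$$\left.\frac{d^2 g_{t_0}}{dt^2}\right|_{t = t_0} = \begin{bmatrix} u_{t_0}^* D & v_{t_0}^* D \end{bmatrix} \begin{bmatrix} \sigma_0 I & -M_0 \\ -M_0^* & \sigma_0 I \end{bmatrix} \begin{bmatrix} D u_{t_0} \\ D v_{t_0} \end{bmatrix} \tag{5.8}$$

Recall that $\bar\sigma(M_0) = \sigma_0$, hence the matrix in (5.8) is positive semidefinite, and
therefore $\left.\frac{d^2 g_{t_0}}{dt^2}\right|_{t = t_0} \ge 0$. By Lemma 5.2, $f$ is convex. #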

Trivially,  we  wrap  this  all  up  with 

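Lemma 5.4  Consider the function $h: \mathcal{D} \to \mathbb{R}$, $h(D) = \bar\sigma(e^{D} M e^{-D})$. Then $h$ is convex.

Proof: Let $D_1$ and $D_2$ be arbitrary elements in $\mathcal{D}$, and let $\lambda \in [0, 1]$. We need to show
that

$$h((1 - \lambda) D_1 + \lambda D_2) \le (1 - \lambda) h(D_1) + \lambda h(D_2)$$

Define $f: \mathbb{R} \to \mathbb{R}$ by $f(t) := h((1 - t) D_1 + t D_2) = \bar\sigma\big(e^{D_1 + \tilde D t} M e^{-(D_1 + \tilde D t)}\big)$ where
$\tilde D$ is defined $\tilde D := D_2 - D_1$. Now, $f$ is convex by Lemma 5.3, therefore for every
$t \in [0, 1]$

$$f(t) \le (1 - t) f(0) + t f(1) \tag{5.9}$$

Note that $f(0) = h(D_1)$ and $f(1) = h(D_2)$. Therefore, setting $t = \lambda$ in (5.9), we
have

$$h((1 - \lambda) D_1 + \lambda D_2) \le (1 - \lambda) h(D_1) + \lambda h(D_2) \tag{5.10}$$

as desired. #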
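Convexity of $h(D) = \bar\sigma(e^{D} M e^{-D})$ can be spot-checked numerically. The sketch below assumes a structure with only 1x1 full blocks, so that $D$ is a real diagonal matrix with last entry zero; the matrix `M`, the sample count and the seed are arbitrary choices for illustration, and random sampling is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def h(d):
    """sigma_max(e^D M e^{-D}) for the real diagonal matrix D = diag(d)."""
    scale = np.exp(d)
    scaled = (scale[:, None] * M) / scale[None, :]   # entry (i, j) is e^{d_i} M_ij e^{-d_j}
    return float(np.linalg.svd(scaled, compute_uv=False)[0])

def random_d():
    d = rng.standard_normal(n)
    d[-1] = 0.0            # last block is normalised, as in (5.5)
    return d

# Largest observed value of h(lam*d1 + (1-lam)*d2) - [lam*h(d1) + (1-lam)*h(d2)].
worst_violation = -np.inf
for _ in range(200):
    d1, d2, lam = random_d(), random_d(), rng.uniform()
    lhs = h(lam * d1 + (1.0 - lam) * d2)
    rhs = lam * h(d1) + (1.0 - lam) * h(d2)
    worst_violation = max(worst_violation, lhs - rhs)

print(worst_violation)
```

A non-positive (up to rounding) value of `worst_violation` is what Lemma 5.4 predicts.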




5.3  Directional  derivatives  of  coalesced  singular  values 

The minimization problem for the upper bound is discussed here. We calculate the first
derivatives of singular values of $e^{Dt} M e^{-Dt}$ for given $D$ in $\mathcal{D}$. The resulting formula will be
used in section 7 to find a $D \in \mathcal{D}$ such that for $t > 0$, sufficiently small, $\bar\sigma(e^{Dt} M e^{-Dt}) < \bar\sigma(M)$,
in other words, a descent direction for $\bar\sigma$. Iterating on this is a method to calculate
the upper bound. In general, the minimization for the upper bound will drive the top
singular values together, since we are minimizing a "max" function. Therefore, we must
carry out the derivative calculations for coalesced singular values (i.e. multiplicity greater
than 1). Derivatives of distinct singular values are just special cases of the following results.

A result from perturbation theory ([Kat] for the theory, [FreLC] and [Doy] for this
application) that we will use freely is that if $T: \mathbb{R} \to \mathbb{C}^{n \times n}$ is an analytic function mapping the
real line into hermitian matrices, then there exist analytic matrices $U(\cdot)$ and $\Lambda(\cdot)$, such
that for all $t$, $U(t) \in \mathbb{C}^{n \times n}$, $U^*(t) U(t) = I$, $\Lambda(t) \in \mathbb{R}^{n \times n}$, $\Lambda(t)$ diagonal, and

$$T(t) U(t) = U(t) \Lambda(t). \tag{5.11}$$

In other words, the eigenvalues of an analytic hermitian matrix are analytic, and there
is a choice of orthogonal analytic eigenvectors as well. We use this result to derive an
expression for the derivatives of nonzero singular values of an analytic matrix.

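Let $W: \mathbb{R} \to \mathbb{C}^{n \times m}$ be an analytic function of the real variable $t$. Suppose $\sigma$ is a nonzero
singular value of $W(0)$ with multiplicity $r$. Then $\sigma^2$ is an eigenvalue of $W(0) W^*(0)$, also with
multiplicity $r$. Hence, there are analytic functions $U_a(\cdot)$, $U_b(\cdot)$, $\Sigma_a(\cdot)$, and $\Lambda_b(\cdot)$, and $\epsilon > 0$, such
that for all $t \in (-\epsilon, \epsilon)$, $U_a(t) \in \mathbb{C}^{n \times r}$, $U_b(t) \in \mathbb{C}^{n \times (n-r)}$, $\Sigma_a(t) \in \mathbb{R}^{r \times r}$, $\Lambda_b(t) \in \mathbb{R}^{(n-r) \times (n-r)}$
with both $\Sigma_a$ and $\Lambda_b$ diagonal and nonnegative for $t \in (-\epsilon, \epsilon)$. At $t = 0$, $\Sigma_a(0) = \sigma I_r$, and
none of the diagonal entries of $\Lambda_b(0)$ are equal to $\sigma^2$. We also have that for all $t \in (-\epsilon, \epsilon)$

$$\begin{bmatrix} U_a(t) & U_b(t) \end{bmatrix}^* \begin{bmatrix} U_a(t) & U_b(t) \end{bmatrix} = I_n \tag{5.12}$$

$$W(t) W^*(t) = U_a(t) \Sigma_a^2(t) U_a^*(t) + U_b(t) \Lambda_b(t) U_b^*(t) \tag{5.13}$$

We want to calculate the derivatives (at $t = 0$) of the $r$ singular values which are coalesced
at $\sigma$ at $t = 0$. Of course, these are just the diagonal entries of $\dot\Sigma_a$, which itself is diagonal.
Roughly speaking, we will differentiate (5.13) to get an explicit formula for $\dot\Sigma_a$.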


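Dropping the explicit $t$ dependence, and post-multiplying (5.13) by $U_a(t)$ we have

$$W W^* U_a = U_a \Sigma_a^2 \tag{5.14}$$

Differentiating this gives

$$\dot W W^* U_a + W \dot W^* U_a + W W^* \dot U_a = \dot U_a \Sigma_a^2 + 2 U_a \Sigma_a \dot\Sigma_a$$

Premultiply this by $U_a^*$, and evaluate at $t = 0$. Recall that at $t = 0$, $\Sigma_a = \sigma I_r$. Hence, at
$t = 0$

$$U_a^* \dot W W^* U_a + U_a^* W \dot W^* U_a + \sigma^2 U_a^* \dot U_a = \sigma^2 U_a^* \dot U_a + 2 \sigma \dot\Sigma_a$$

Two terms cancel, and since $\sigma \ne 0$ by assumption, we are left with

$$\dot\Sigma_a = \frac{1}{2\sigma} U_a^* \left( \dot W W^* + W \dot W^* \right) U_a \tag{5.15}$$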

Actual computation of the derivatives requires one additional computation. Consider a
singular value decomposition of $W(0)$,

$$W(0) = \sigma U_1 V_1^* + U_2 \Sigma_2 V_2^* \tag{5.16}$$

Since the singular vectors associated with repeated singular values are not unique, $U_1$
need not be equal to $U_a(0)$. But, both have orthogonal columns, and they span the same
subspace in $\mathbb{C}^n$, therefore, there is a unitary matrix $K \in \mathbb{C}^{r \times r}$ such that

$$U_a(0) = U_1 K \tag{5.17}$$

Substituting (5.16) and (5.17) into (5.15) gives

$$K \dot\Sigma_a K^* = \frac{1}{2} \left( U_1^* \dot W V_1 + V_1^* \dot W^* U_1 \right) \tag{5.18}$$

Since $K$ is unitary, this is a similarity transformation, hence the derivatives of the $r$ singular
values coalesced at $\sigma$ are the eigenvalues of

$$\frac{1}{2} \left( U_1^* \dot W V_1 + V_1^* \dot W^* U_1 \right)$$

Let us do the above calculations for the special case we need.


Theorem 5.5  Suppose $W(t)$ is of the form $e^{Dt} M e^{-Dt}$ where $D \in \mathcal{D}$ and $M$ is given.
Obviously $W(0) = M$ and $\dot W(0) = DM - MD$. Hence if

$$W(0) = M = \sigma U_1 V_1^* + U_2 \Sigma_2 V_2^* \tag{5.19}$$

then the derivatives of the clustered singular values at $\sigma$ are the eigenvalues of

$$\sigma U_1^* D U_1 - \sigma V_1^* D V_1 \tag{5.20}$$

In particular, let $\lambda_1, \lambda_2, \dots, \lambda_r$ be the eigenvalues of $U_1^* D U_1 - V_1^* D V_1$. They are real
because this matrix is hermitian. At a nonzero value of $t$, the $r$ singular values that were
$\sigma$ at $t = 0$ satisfy

$$\sigma_i(t) = \sigma (1 + \lambda_i t) + g_i(t) \tag{5.21}$$

where $\lim_{t \to 0} \frac{g_i(t)}{t} = 0$.

Hence if we can find a $D \in \mathcal{D}$ with all the eigenvalues of $U_1^* D U_1 - V_1^* D V_1$ negative,
then by moving a small amount in that direction, all of the singular values in the cluster
will be reduced.

After reviewing some results from convex analysis in the next section, we will address the
problem of finding a $D \in \mathcal{D}$ such that for small $t$, all of the singular values in the cluster
are reduced. As we have shown here, this is equivalent to finding a $D \in \mathcal{D}$ such that all
the eigenvalues of $U_1^* D U_1 - V_1^* D V_1$ are negative.
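For a simple (non-repeated) maximum singular value, formula (5.20) reduces to the scalar $\sigma (u^* D u - v^* D v)$, which can be compared against a finite difference. The sketch below assumes a diagonal real $D$ (only 1x1 full blocks) and a random matrix `M`, for which the top singular value is generically simple; the step size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
d = rng.standard_normal(n)
d[-1] = 0.0                                   # D = diag(d), last block normalised

def sigma_max_at(t):
    """Largest singular value of e^{Dt} M e^{-Dt} for D = diag(d)."""
    scale = np.exp(d * t)
    return float(np.linalg.svd((scale[:, None] * M) / scale[None, :], compute_uv=False)[0])

# Top singular triple of M: M v = sigma u.
U, s, Vh = np.linalg.svd(M)
u, v, sigma = U[:, 0], Vh[0].conj(), s[0]

# (5.20) with r = 1: derivative = sigma * (u^* D u - v^* D v).
predicted = sigma * (np.vdot(u, d * u).real - np.vdot(v, d * v).real)

step = 1e-6
numeric = (sigma_max_at(step) - sigma_max_at(-step)) / (2.0 * step)
print(predicted, numeric)
```

When `predicted` is negative, moving along $D$ with small $t > 0$ reduces $\bar\sigma$; when it is positive, the same holds for small $t < 0$.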


5.4  Convexity 

This section is devoted to some simple results from convex analysis which will subsequently
be used to find $D \in \mathcal{D}$ such that all the eigenvalues of $U^* D U - V^* D V$ are positive. This
gives a descent direction for $\bar\sigma(e^{D} M e^{-D})$. All of the results here are from [Roc].

Let $X$ be a real, finite dimensional vector space, with inner product $\langle \cdot, \cdot \rangle : X \times X \to \mathbb{R}$,
and let $V$ be a compact subset of $X$. The main question this section addresses is "does
there exist a point $x \in X$ and $\beta > 0$ such that $\min_{y \in V} \langle x, y \rangle > \beta$?"

The  following  definitions  and  results  are  standard. 

Definition 5.6  A subset $V \subset X$ is convex if $\lambda u + (1 - \lambda) v \in V$ for every $u, v \in V$ and
$\lambda \in [0, 1]$.

Definition 5.7  For a subset $V \subset X$ the convex hull of $V$, $\mathrm{co}(V)$, is the smallest convex
set containing $V$:

$$\mathrm{co}(V) = \bigcap_{F \text{ convex},\ V \subset F} F \tag{5.22}$$

Lemma 5.8  For all $V \subset X$, $\mathrm{co}(V)$ is convex. If $V$ is convex, then $\mathrm{co}(V) = V$. If $V$ is
compact, then $\mathrm{co}(V)$ is compact.

Lemma 5.9  The convex hull of $V \subset X$ is all finite convex combinations of points in $V$.
That is

$$\mathrm{co}(V) = \left\{ \sum_{i=1}^{m} \alpha_i x_i : m \in \mathbb{N},\ \alpha_i \in [0, 1],\ \sum_{i=1}^{m} \alpha_i = 1,\ x_i \in V \right\} \tag{5.23}$$

Lemma 5.10  Let $V$ be a compact subset of $X$. Then there is a unique point $x \in \mathrm{co}(V)$
such that $\|x\| = \min \{ \|y\| : y \in \mathrm{co}(V) \}$. When clear, we denote this as $x = \min(\mathrm{co} V)$.

Lemma 5.11  Let $V$ be a compact subset of $X$, and let $x \in \mathrm{co}(V)$ be the unique minimizer
described above, i.e. $\|x\| = \min \{ \|y\| : y \in \mathrm{co}(V) \}$. For any $z \in \mathrm{co}(V)$, $\langle z, x \rangle \ge \|x\|^2$.

Lemma 5.12  Let $x \in X$. If $\|x\| > \|\min(\mathrm{co} V)\|$, then there is a $y \in V$ such that
$\langle x, y \rangle < \|x\|^2$.

These give rise to the main theorem.

Theorem 5.13  Let $V$ be a compact subset of $X$. There exists $x \in X$ such that $\min_{y \in V} \langle x, y \rangle > 0$
if and only if $0 \notin \mathrm{co}(V)$.


The minimum point of the convex hull of a set $V$ can be found via an iterative algorithm,
due to [Gil]. Important extensions of this are found in [Wol] and [Hau]. All the algorithms
have one main computational requirement: for each $x \in X$, we need to be able to generate
a point $y_x \in V$ such that

$$\langle x, y_x \rangle = \min_{y \in V} \langle x, y \rangle \tag{5.24}$$

Note since $V$ is closed, there always is such a $y_x$, though it may not be unique.

The algorithm from [Gil] is as follows: Define a sequence $\{x_i\}_{i=1}^{\infty}$ in the convex hull of $V$
via the following rules:

a.1  Pick any point $x_1 \in \mathrm{co} V$. In particular, $x_1$ can be any element of $V$.

a.2  Given $x_i$, pick $y_i \in V$ to minimize the inner product as above in equation (5.24).

a.3  Define $x_{i+1} = \min \mathrm{co} \{x_i, y_i\}$. Obviously, $x_{i+1} \in \mathrm{co} V$. Return to a.2.
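Steps a.1 to a.3 can be sketched for a finite point set in $\mathbb{R}^2$ with the standard inner product. This is an illustration under assumptions: the function name `gilbert_min_norm`, the stopping test (based on Lemma 5.11) and the example points are introduced here and are not part of [Gil].

```python
import numpy as np

def gilbert_min_norm(points, max_iter=1000, tol=1e-12):
    """Approximate the minimum-norm point of co(V) for a finite set V (rows of `points`)."""
    V = np.asarray(points, dtype=float)
    x = V[0].copy()                               # a.1: start at an element of V
    for _ in range(max_iter):
        y = V[np.argmin(V @ x)]                   # a.2: y minimises <x, y> over V
        # At the minimiser, <y, x> >= ||x||^2 for every y in V (Lemma 5.11).
        if x @ x - y @ x <= tol:
            break
        diff = x - y
        # a.3: minimum-norm point on the segment between x and y.
        t = np.clip((x @ diff) / (diff @ diff), 0.0, 1.0)
        x = x - t * diff
    return x

# Hypothetical example: a triangle whose hull does not contain the origin.
triangle = [(1.0, 1.0), (2.0, 0.0), (1.0, -1.0)]
x_star = gilbert_min_norm(triangle)
print(x_star)
```

For this triangle the nearest point of the hull to the origin lies on the edge joining the first and third vertices. Since the result is nonzero, Theorem 5.13 says there is an $x$ with $\min_{y \in V} \langle x, y \rangle > 0$, and by Lemma 5.11 `x_star` itself is such a point.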


Hauser's algorithm [Hau] makes a more intelligent choice for $x_{i+1}$, using not only $x_i$ and $y_i$,
but past values of $y_j$ as well. It is a generalization of Wolfe's algorithm [Wol] for polytopes.
In any event,

Claim: The sequence $\{x_i\}$ converges to the minimum point in the convex hull of $V$.


Proof of claim: Obviously, the sequence $\{x_i\}$ has $\|x_{i+1}\| \le \|x_i\|$ for each $i$. Therefore
both sequences $\{x_i\}$ and $\{y_i\}$ are bounded, hence we can choose a subsequence $\{n_k\}$ so
that $x_{n_k} \to \bar x$ and $y_{n_k} \to \bar y$. Since both $\mathrm{co} V$ and $V$ are closed, we have $\bar x \in \mathrm{co} V$ and $\bar y \in V$.
By continuity, and step [a.2] of the algorithm, it is easy to show that $\langle \bar x, \bar y \rangle = \min_{y \in V} \langle \bar x, y \rangle$.

Now suppose that $\bar x \ne \min(\mathrm{co} V)$. Since $\bar x \in \mathrm{co} V$, we have by Lemmas 5.10 and 5.12 that

$$\langle \bar x, \bar y \rangle < \|\bar x\|^2 \tag{5.25}$$

Consequently, $\|\bar x\|$ is larger than $\|\min(\mathrm{co} \{\bar x, \bar y\})\|$. Let $\epsilon > 0$ be the difference.

$$\epsilon := \|\bar x\| - \|\min(\mathrm{co} \{\bar x, \bar y\})\| > 0 \tag{5.26}$$

Now the function $\min(\mathrm{co} \{\cdot, \cdot\}): X \times X \to X$ is a continuous function. Hence there is an
integer $K$ such that for all $k \ge K$,

$$\|\min(\mathrm{co} \{x_{n_k}, y_{n_k}\})\| - \frac{\epsilon}{2} < \|\min(\mathrm{co} \{\bar x, \bar y\})\| \tag{5.27}$$

This implies that for all $k \ge K$

$$\|\bar x\| > \|\min(\mathrm{co} \{x_{n_k}, y_{n_k}\})\| + \frac{\epsilon}{2} \tag{5.28}$$

which contradicts that the sequence $\{\|x_i\|\}$ is nonincreasing. Therefore $\bar x = \min(\mathrm{co} V)$.

Finally, it is an easy fact to show that if $\{z_k\}$ is a sequence in a compact, convex set,
$G$, with the norm satisfying $\|z_{k+1}\| \le \|z_k\|$ for all integers $k$, and there is a subsequence
$\{z_{n_k}\}$ converging to $\min(G)$, then in fact, the sequence itself is convergent, with limit of
course being $\min(G)$. Hence the sequence we generate, $\{x_i\}$, does indeed converge to the
minimum point. #

In the next section, we consider the problem of finding a matrix $D \in \mathcal{D}$, such that all
of the eigenvalues of $U_1^* D U_1 - V_1^* D V_1$ are positive. Recall from Theorem 5.5, this is
equivalent to finding a "descent direction" for the function $\bar\sigma(e^{D} M e^{-D})$. This problem
can be formulated naturally into a "minimum point in convex hull" formulation as we have
covered here. We also show that finding a point $y_x \in V$ that minimizes the inner product
$\langle x, y \rangle$ can be cast as a hermitian eigenvalue problem.


6  Upper  bound  and  the  structured  singular  value 


6.1  Finding  Descent  Directions  for  the  Upper  bound 

6.1.1  Defining  a  Generalized  Gradient  Set 

Our problem of finding a $D \in \mathcal{D}$ such that all the eigenvalues of $U^* D U - V^* D V$ are
positive can be attacked using the convexity results from the previous section. The results
are quite nice, and computationally tractable. The motivation comes from [Doy], though
this section generalizes the results there.

We consider square matrices, $\mathbb{C}^{n \times n}$, and a compatible block structure $\mathbf{\Delta}$, with integers
$r_1, \dots, r_s$ defining the dimensions of the blocks, as outlined in section 3.1.

Define $X$ to be the following set of block diagonal, hermitian matrices:

$$X := \{ \mathrm{diag}[Z_1, \dots, Z_s, z_1, \dots, z_{f-1}] : Z_i = Z_i^* \in \mathbb{C}^{r_i \times r_i},\ z_j \in \mathbb{R} \} \tag{6.1}$$

This is a real inner product space (of dimension $\sum_{i=1}^{s} r_i^2 + f - 1$) with inner product
defined by

$$P, T \in X \qquad \langle P, T \rangle := \mathrm{tr}(PT) \tag{6.2}$$

which, in terms of the blocks that make up $P$ and $T$, is just

$$\langle P, T \rangle = \sum_{i=1}^{s} \mathrm{tr}(P_i T_i) + \sum_{j=1}^{f-1} p_j t_j \tag{6.3}$$

Remark: When there are only full blocks, $s = 0$, then $X$ is the set of $(f - 1) \times (f - 1)$,
diagonal, real matrices, with the obvious inner product. In those instances, we will
identify $X$ with $\mathbb{R}^{f-1}$.

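Recall the definition for $\mathcal{D}$ in (5.5). Let $D \in \mathcal{D}$ be given. Then $D$ looks like

$$D = \mathrm{diag}[D_1, \dots, D_s, d_1 I_{m_1}, \dots, d_{f-1} I_{m_{f-1}}, 0_{m_f}] \tag{6.4}$$

where $D_i = D_i^* \in \mathbb{C}^{r_i \times r_i}$ and $d_j \in \mathbb{R}$. Associate to this $D \in \mathcal{D}$, a $\hat D \in X$ by setting

$$\hat D = \mathrm{diag}[D_1, \dots, D_s, d_1, \dots, d_{f-1}] \tag{6.5}$$

Note the natural one to one correspondence between the elements of $\mathcal{D}$ and $X$.

Now, let $M \in \mathbb{C}^{n \times n}$ be given. If the maximum singular value of $M$, $\sigma$, has multiplicity
equal to $r$, then $M$ is

$$M = \sigma U V^* + U_2 \Sigma_2 V_2^* \tag{6.6}$$

where $U, V \in \mathbb{C}^{n \times r}$, $U^* U = V^* V = I_r$, $U_2, V_2 \in \mathbb{C}^{n \times (n-r)}$, $U_2^* U_2 = V_2^* V_2 = I_{(n-r)}$, and
$\Sigma_2 \in \mathbb{R}^{(n-r) \times (n-r)}$ is diagonal, positive semidefinite, and none of its diagonal entries are
equal to $\sigma$.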


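Recall that we want to find a $D \in \mathcal{D}$ such that all the eigenvalues of $U^* D U - V^* D V$ are
positive, or in other words, $\lambda_{\min} > 0$. Using Theorem 5.5, for such $D$, then with $t < 0$,
sufficiently small in magnitude,

$$\bar\sigma(e^{Dt} M e^{-Dt}) < \bar\sigma(M)$$

and hence computation of the $\inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$ depends on finding these $D$.
For notational purposes, partition $U$ and $V$ compatibly with $\mathbf{\Delta}$ as

$$U = \begin{bmatrix} A_1 \\ \vdots \\ A_s \\ E_1 \\ \vdots \\ E_f \end{bmatrix} \tag{6.7}$$

$$V = \begin{bmatrix} B_1 \\ \vdots \\ B_s \\ F_1 \\ \vdots \\ F_f \end{bmatrix} \tag{6.8}$$

where $A_i, B_i \in \mathbb{C}^{r_i \times r}$, $E_j, F_j \in \mathbb{C}^{m_j \times r}$.

With this notation

$$U^* D U - V^* D V = \sum_{i=1}^{s} \left( A_i^* D_i A_i - B_i^* D_i B_i \right) + \sum_{j=1}^{f-1} d_j \left( E_j^* E_j - F_j^* F_j \right) \tag{6.9}$$

Therefore, since this matrix is hermitian, $\lambda_{\min}(U^* D U - V^* D V)$ is just

$$\lambda_{\min} = \min_{\eta \in \mathbb{C}^r,\ \|\eta\| = 1} \eta^* \left[ \sum_{i=1}^{s} \left( A_i^* D_i A_i - B_i^* D_i B_i \right) + \sum_{j=1}^{f-1} d_j \left( E_j^* E_j - F_j^* F_j \right) \right] \eta \tag{6.10}$$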


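Exchanging the order of multiplication, and taking traces yields the equivalent form

$$\lambda_{\min} = \min_{\eta \in \mathbb{C}^r,\ \|\eta\| = 1} \left[ \sum_{i=1}^{s} \mathrm{tr}\left( D_i \left( A_i \eta \eta^* A_i^* - B_i \eta \eta^* B_i^* \right) \right) + \sum_{j=1}^{f-1} d_j\, \eta^* \left( E_j^* E_j - F_j^* F_j \right) \eta \right] \tag{6.11}$$

This can be rewritten using inner products as

$$\lambda_{\min} = \min_{\eta \in \mathbb{C}^r,\ \|\eta\| = 1} \langle \hat D, P^{\eta} \rangle \tag{6.12}$$

where $P^{\eta} \in X$ is defined by

$$P_i^{\eta} := A_i \eta \eta^* A_i^* - B_i \eta \eta^* B_i^*, \qquad p_j^{\eta} := \eta^* \left( E_j^* E_j - F_j^* F_j \right) \eta \tag{6.13}$$

Let $\nabla_M \subset X$ be the set of all such $P^{\eta}$. That is

$$\nabla_M := \{ \mathrm{diag}[P_1^{\eta}, \dots, P_s^{\eta}, p_1^{\eta}, \dots, p_{f-1}^{\eta}] : \text{as in (6.13)},\ \eta \in \mathbb{C}^r,\ \|\eta\| = 1 \}. \tag{6.14}$$

Recall that when $r \ge 2$, the matrices $U$ and $V$ (which in turn define $A$, $B$, $E$ and $F$ above)
are not unique. It is easy to verify that the set $\nabla_M$ does not depend on the particular
choice.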


Then, for a given $D \in \mathcal{D}$ (and corresponding $\hat D \in X$) we have

$$\lambda_{\min}(U^* D U - V^* D V) = \min_{P \in \nabla_M} \langle \hat D, P \rangle. \tag{6.15}$$

Hence, it is the set $\nabla_M$ that determines whether or not there is a $D$ that gives $\lambda_{\min} > 0$.
The next theorem follows directly from equation (6.12) and Theorem 5.13.

Theorem 6.1  There exists a $D \in \mathcal{D}$ such that $\lambda_{\min}(U^* D U - V^* D V) > 0$ if and only if
$0 \notin \mathrm{co}(\nabla_M)$.

If $0 \in \mathrm{co}(\nabla_M)$ then for every $D \in \mathcal{D}$, $\lambda_{\min} \le 0$ and $\lambda_{\max} \ge 0$. Hence to first order, the
maximum singular value either increases or stays the same (we are at a stationary point).
By convexity of $\bar\sigma(e^{D} M e^{-D})$, we see that we are at a global minimum. To summarize:

Theorem 6.2  $\bar\sigma(M) = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$ if and only if $0 \in \mathrm{co}(\nabla_M)$.

On occasion, we will abuse the notation adopted above. When the matrix in question,
in this case $M$, is clear from the context, we will drop the subscript and just write $\nabla$.

Finally, we address the problem of computing the point of minimum norm in the convex
hull of $\nabla_M$. As mentioned in section 5.4, for each $\hat D \in X$, we need to be able to find a
$P_D \in \nabla_M$ that achieves

$$\langle \hat D, P_D \rangle = \min_{P \in \nabla_M} \langle \hat D, P \rangle. \tag{6.16}$$


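and $d_j$ for $j = 1, \ldots, f-1$. Then

$$\min_{P \in \nabla_M} \langle D, P \rangle = \min_{\eta \in \mathbb{C}^r,\ \|\eta\| = 1} \eta^* \underbrace{\left[ \sum_{i=1}^{s} \left( A_i^* D_i A_i - B_i^* D_i B_i \right) + \sum_{j=1}^{f-1} d_j \left( E_j^* E_j - F_j^* F_j \right) \right]}_{W} \eta \tag{6.17}$$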


Obviously, the numerical value of this is just the minimum eigenvalue of the hermitian
matrix W. Let $\eta_W \in \mathbb{C}^r$ be any unit length eigenvector associated with this eigenvalue,
then

$$\arg\min_{P \in \nabla_M} \langle D, P \rangle = \operatorname{diag}\left[ P_1^{\eta_W}, \ldots, P_s^{\eta_W}, p_1^{\eta_W}, \ldots, p_{f-1}^{\eta_W} \right] \in \nabla_M \tag{6.18}$$

where the P's and p's are defined as

$$P_i^{\eta_W} := A_i \eta_W \eta_W^* A_i^* - B_i \eta_W \eta_W^* B_i^*, \qquad p_j^{\eta_W} := \eta_W^* \left( E_j^* E_j - F_j^* F_j \right) \eta_W \tag{6.19}$$

for each i and j.

Using this formula, and the algorithm in [Hau], we can find the minimum point in the
convex hull of $\nabla_M$ as desired.
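The eigenvector step in (6.17) to (6.19) can be illustrated with a minimal Python sketch. It assumes the special case s = 0 with 1 x 1 full blocks and multiplicity r = 2, so that W is a 2 x 2 hermitian matrix with a closed-form smallest eigenpair. The function names, the rows `rows_e` and `rows_f` (standing in for the E_j and F_j) and the weights `d_weights` are hypothetical illustration data, not values from this report.

```python
import math

def min_eigpair_2x2(w11, w12, w22):
    """Smallest eigenvalue and a unit eigenvector of the hermitian
    matrix [[w11, w12], [conj(w12), w22]] (w11 and w22 real)."""
    mean = 0.5 * (w11 + w22)
    radius = math.hypot(0.5 * (w11 - w22), abs(w12))
    lam = mean - radius
    if abs(w12) > 1e-14:
        v = (w12, lam - w11)            # solves (W - lam I) v = 0
    elif w11 <= w22:
        v = (1.0, 0.0)
    else:
        v = (0.0, 1.0)
    norm = math.hypot(abs(v[0]), abs(v[1]))
    return lam, (v[0] / norm, v[1] / norm)

def build_w(rows_e, rows_f, weights):
    """W = sum_j d_j (E_j^* E_j - F_j^* F_j), returned as (w11, w12, w22)."""
    w11 = w22 = 0.0
    w12 = 0j
    for e, f, dj in zip(rows_e, rows_f, weights):
        w11 += dj * (abs(e[0]) ** 2 - abs(f[0]) ** 2)
        w22 += dj * (abs(e[1]) ** 2 - abs(f[1]) ** 2)
        w12 += dj * (e[0].conjugate() * e[1] - f[0].conjugate() * f[1])
    return w11, w12, w22

def p_value(e, f, eta):
    """p_j = eta^* (E_j^* E_j - F_j^* F_j) eta = |E_j eta|^2 - |F_j eta|^2."""
    e_eta = e[0] * eta[0] + e[1] * eta[1]
    f_eta = f[0] * eta[0] + f[1] * eta[1]
    return abs(e_eta) ** 2 - abs(f_eta) ** 2

# Hypothetical data: two scaled blocks (the last block is not scaled).
rows_e = [(1 + 0j, 0j), (0.5 + 0j, 0.5 + 0j)]
rows_f = [(0j, 1 + 0j), (0.5 + 0j, -0.5 + 0j)]
d_weights = [0.3, -0.7]

lam_min, eta_w = min_eigpair_2x2(*build_w(rows_e, rows_f, d_weights))
p_values = [p_value(e, f, eta_w) for e, f in zip(rows_e, rows_f)]
inner_product = sum(dj * pj for dj, pj in zip(d_weights, p_values))
```

Here `inner_product` is the value of the inner product at the minimizing point and should agree with `lam_min`, while `p_values` holds the entries of the corresponding point of the set, as in (6.18).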

6.1.2 A Property of $\nabla$ when M is real

If the matrix M is real, then the minimum point in the convex hull of $\nabla$ is real. We will
prove this, and then see the implication it has on computing $\inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$. Roughly
speaking, each block of the optimal $D \in \mathcal{D}$ can be chosen to be real, symmetric.

Theorem 6.3 If M is real, then for any block structure $\boldsymbol{\Delta}$, the minimum point in the
convex hull of $\nabla_M$ is real.

Proof: Since M is real, both U and V in the SVD of M may be taken as real. Now
recall the algorithm to find $\min(\operatorname{co} \nabla_M)$ as described in the last chapter. We can
pick $x_1$ to be any element of $\operatorname{co} \nabla_M$. If we choose an arbitrary real unit vector $\eta_1$,
then our initial point $x_1$ is real. Obviously then, the point $y_1$ may be chosen real
too. Simple induction gives that with this choice of $x_1$, the entire sequence $\{x_i\}$ is
real. It converges to the minimum point, which therefore must be real. ∎

This  leads  to  the  next  theorem. 
Theorem 6.4 Let $\mathcal{D}_R$ be the set of real, symmetric members of $\mathcal{D}$. If M is real, and the
infimum

$$\inf_{D_R \in \mathcal{D}_R} \bar\sigma\left( e^{D_R} M e^{-D_R} \right) \tag{6.20}$$

is achieved, then in fact

$$\inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right) = \inf_{D_R \in \mathcal{D}_R} \bar\sigma\left( e^{D_R} M e^{-D_R} \right) \tag{6.21}$$


We conjecture that this is true even when the infima are not achieved, but the
details are not worked out here.

Conjecture 6.5 Let $\mathcal{D}_R$ be the set of real, symmetric members of $\mathcal{D}$. If M is real, then

$$\inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right) = \inf_{D_R \in \mathcal{D}_R} \bar\sigma\left( e^{D_R} M e^{-D_R} \right) \tag{6.22}$$


6.2 When $\mu = \bar\sigma$

The results of this section relate the upper bound to $\mu$.

As usual, let $\boldsymbol{\Delta}$ be a given structure, and let M be a given complex matrix. In the last
section we showed that $\bar\sigma(M) = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$ if and only if $0 \in \operatorname{co}(\nabla_M)$. A natural
question is: “When does $\bar\sigma(M) = \mu_{\boldsymbol{\Delta}}(M)$?”. The answer, which will link the upper bound
and $\mu$ together, is the subject of the next theorem. Again, the set $\nabla$ plays a crucial role.

Theorem 6.6 $\bar\sigma(M) = \mu_{\boldsymbol{\Delta}}(M)$ if and only if $0 \in \nabla_M$.

Remark: This is exactly the result obtained in [Doy]. However, [Doy] considers only
structures with full blocks (s = 0). This section generalizes that result to structures
with repeated scalar blocks as well.

Proof: For the proof, we follow the style of [Doy], and prove the equivalence of four
statements:

1. $0 \in \nabla_M$

2. There exist $\eta \in \mathbb{C}^r$, $\|\eta\| = 1$ and $Q \in \mathcal{Q}$ such that $Q U \eta = V \eta$

3. There exist $\xi \in \mathbb{C}^n$, $\|\xi\| = 1$ and $Q \in \mathcal{Q}$ such that $Q M \xi = \bar\sigma \xi$

4. $\bar\sigma(M) = \mu_{\boldsymbol{\Delta}}(M)$


1 → 2: From the definition of $\nabla$, (6.14), $0 \in \nabla_M$ implies that for some $\eta \in \mathbb{C}^r$,

$$\begin{aligned} A_i \eta \eta^* A_i^* - B_i \eta \eta^* B_i^* &= 0, & i &\le s \\ \eta^* \left( E_j^* E_j - F_j^* F_j \right) \eta &= 0, & j &\le f-1 \\ \|\eta\| &= 1 \end{aligned} \tag{6.23}$$

Obviously, for $i \le s$, there is a phase $e^{j\theta_i}$ such that $e^{j\theta_i} A_i \eta = B_i \eta$. For $j \le f-1$,
$\|E_j \eta\| = \|F_j \eta\|$, so there exists a unitary matrix $Q_j$ such that $Q_j E_j \eta = F_j \eta$. The
only thing left is the last full block. Since $\|U\eta\| = \|V\eta\|$ we must have $\|E_f \eta\| = \|F_f \eta\|$.
This gives a unitary matrix $Q_f$ with $Q_f E_f \eta = F_f \eta$. Arranging the phases
and Q's in a block diagonal fashion gives statement 2.

2 → 1: This follows along the lines of 1 → 2.

2 → 3: The matrix M has a SVD of $M = \bar\sigma U V^* + U_2 \Sigma_2 V_2^*$. Hence $Q M (V\eta) = \bar\sigma Q U \eta = \bar\sigma V \eta$.
Defining $\xi := V\eta$ gives statement 3.

3 → 2: A SVD of QM is

$$Q M = \bar\sigma (Q U) V^* + (Q U_2) \Sigma_2 V_2^* \tag{6.24}$$

If $Q M \xi = \bar\sigma \xi$, then $\xi$ must lie in the subspace spanned by the right singular vectors
associated with $\bar\sigma$. Hence there is a vector $\eta$ satisfying $\xi = V\eta$. Obviously $\|\eta\| = 1$
and

$$Q U \eta = Q U V^* \xi = \frac{1}{\bar\sigma} Q M \xi = \xi = V \eta. \tag{6.25}$$

3 → 4: $Q M \xi = \bar\sigma \xi$ implies that $\mu_{\boldsymbol{\Delta}}(M) = \max_{Q \in \mathcal{Q}} \rho(QM) \ge \rho(QM) \ge \bar\sigma(M)$. However $\bar\sigma$ is
always an upper bound for $\mu$, hence we must have equality.

4 → 3: This is obvious by Theorem 4.5 ($\mu_{\boldsymbol{\Delta}}(M) = \max_{Q \in \mathcal{Q}} \rho(QM)$). ∎

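Theorem 6.6 is extremely important in determining when the upper bound gives $\mu$. The
idea is to find $D_0 \in \mathcal{D}$ such that $0 \in \operatorname{co}(\nabla_{e^{D_0} M e^{-D_0}})$. This can in principle be done using
a steepest descent method, and the facts about $\nabla$ in section 6.1. Then, we know that

$$\mu(M) = \mu\left( e^{D_0} M e^{-D_0} \right) \le \inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right) = \bar\sigma\left( e^{D_0} M e^{-D_0} \right). \tag{6.26}$$

If, in fact, $0 \in \nabla_{e^{D_0} M e^{-D_0}}$, then by Theorem 6.6 we must have

$$\mu\left( e^{D_0} M e^{-D_0} \right) = \bar\sigma\left( e^{D_0} M e^{-D_0} \right) \tag{6.27}$$

so that

$$\mu(M) = \inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right). \tag{6.28}$$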




Therefore, if the block structure $\boldsymbol{\Delta}$ imparts the property on $\nabla$ such that $0 \in \operatorname{co}(\nabla)$ implies
$0 \in \nabla$, then we will always have $\mu_{\boldsymbol{\Delta}}(M) = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$.

A technical point we have not addressed is when the “inf” is not achieved. In that case the
above reasoning cannot be used directly, since we never actually get $0 \in \operatorname{co}(\nabla)$. However,
everything still works (this proof also makes the above arguments rigorous):

Theorem 6.7 If the block structure $\boldsymbol{\Delta}$ has the property that $0 \in \operatorname{co}(\nabla)$ always implies
$0 \in \nabla$, then $\mu_{\boldsymbol{\Delta}}(M) = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$.

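Proof: Let $\beta = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$. Let $D_k$ be a sequence in $\mathcal{D}$ such that $\bar\sigma(e^{D_k} M e^{-D_k})$
converges to $\beta$ as $k \to \infty$. Denote $W_k = e^{D_k} M e^{-D_k}$. Since the sequence $W_k$ is
bounded, it has a convergent subsequence with limit W. Obviously, by continuity of
$\bar\sigma$ and $\mu$, $\bar\sigma(W) = \beta$ and $\mu(M) = \mu(W)$. We claim that $0 \in \operatorname{co}(\nabla_W)$. Suppose not,
then there exist $\hat{D} \in \mathcal{D}$ and $\epsilon > 0$ such that $\bar\sigma(e^{\hat{D}} W e^{-\hat{D}}) = \beta - \epsilon$. Choose k so that
$\|W_k - W\| < \dfrac{\epsilon}{2\kappa(e^{\hat{D}})}$, where $\kappa(\cdot)$ denotes condition number. Then

$$\left\| e^{\hat{D}} (W_k - W) e^{-\hat{D}} \right\| < \frac{\epsilon}{2} \tag{6.29}$$

which yields

$$\left\| e^{\hat{D}} W_k e^{-\hat{D}} \right\| < \beta - \frac{\epsilon}{2}. \tag{6.30}$$

This contradicts that $\beta$ was the infimum, hence indeed $0 \in \operatorname{co}(\nabla_W)$. By hypothesis,
this means $0 \in \nabla_W$, so by Theorem 6.6, $\mu(W) = \bar\sigma(W)$. Recalling continuity, we get
$\mu_{\boldsymbol{\Delta}}(M) = \beta$ as desired. ∎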

In the section to follow, we will determine some structures for which the hypothesis of
Theorem 6.7 always holds. Therefore, for such structures, the upper bound will always
equal $\mu$.

To conclude this section, consider the minimization over the D's. Typically, since we are
minimizing the maximum singular value, the top singular values tend to coalesce, so that
at the minimum, the multiplicity of $\bar\sigma$ is greater than or equal to 2. This is typical of
any “min max” problem. Suppose though, that at the minimum, $\bar\sigma$ was distinct.
Obviously, since we are at a minimum, we must have $0 \in \operatorname{co}(\nabla)$. But if the multiplicity
of $\bar\sigma$ is only 1, then $\nabla$ is a single point, and hence $\nabla = \{0\}$. This reasoning gives:

Corollary 6.8 If, at the minimum of $\bar\sigma(e^{D} M e^{-D})$, the maximum singular value has
multiplicity of 1, then $\mu(M) = \min_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$.
7 Properties of the set $\nabla_M$

With the machinery presented in the last section, we can now explore the relationship
between the upper bound and $\mu$ for a variety of block structures.

7.1  Block  structures  with  no  repeated  scalar  blocks 

We  begin  with  block  structures  having  no  repeated  scalar  blocks,  that  is,  when  s  =  0.  All 
material  here  is  taken  from  [Doy]  and  [MorD]  and  is  included  for  completeness. 

7.1.1  2  full  blocks 

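The situation with two full blocks is relatively simple. Referring back to (6.13) and (6.14),
we see that $\nabla$ will always have the form

$$\nabla = \left\{ \eta^* \left( E^* E - F^* F \right) \eta : \eta \in \mathbb{C}^r,\ \|\eta\| = 1 \right\} \tag{7.1}$$

for some given r > 0 and $E, F \in \mathbb{C}^{m_1 \times r}$. Since $E^* E - F^* F$ is hermitian, $\nabla$ is just a closed
interval in the real line. Obviously, this is always convex, so if $0 \in \operatorname{co}(\nabla)$, we in fact have
$0 \in \nabla$. Hence by Theorem 6.7 we have:

Theorem 7.1 If $\boldsymbol{\Delta}$ consists of two full blocks (s = 0, f = 2), then

$$\mu_{\boldsymbol{\Delta}}(M) = \inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right). \tag{7.2}$$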

Remark:  The  two  block  case  was  first  solved  in  1959  by  Redheffer  [Red].  His  approach 
is  quite  different.  Interestingly,  it  uses  a  form  of  Schauder’s  fixed  point  theorem, 
[DunS]  and  hence  does  not  boil  down  to  just  simple  linear  algebra.  Similarly,  the 
method  of  proof  here  uses  the  analyticity  of  eigenvalues  of  an  analytic  matrix,  which 
is  also  a  nontrivial  fact.  It  would  be  quite  nice  if  simpler  proofs  existed,  but  none 
are  known. 

Also, this is a fairly simple thing to compute. Recall that for two full blocks, there is
only one free parameter in the set $\mathcal{D}$; consequently, the computation is a one dimensional
search on a convex function. The only drawback is that each cost evaluation is
a $\bar\sigma$ evaluation, which, while not exceedingly difficult, is nonetheless time consuming.
Note that a search need not involve gradient calculations, hence the code can be
quite simple.
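As a hedged illustration of this one dimensional, gradient-free search, the sketch below treats a 2 x 2 matrix with two 1 x 1 blocks, so that D = diag(t, 0) and the scaled matrix is diag(e^t, 1) M diag(e^{-t}, 1). The golden-section routine, the closed-form 2 x 2 singular value, the search interval and the example matrix `M` are assumptions chosen for illustration; they are not the code used for this report.

```python
import math

def sigma_max_2x2(m11, m12, m21, m22):
    """Largest singular value of a complex 2 x 2 matrix (closed form)."""
    fro2 = abs(m11) ** 2 + abs(m12) ** 2 + abs(m21) ** 2 + abs(m22) ** 2
    det2 = abs(m11 * m22 - m12 * m21) ** 2
    disc = max(fro2 * fro2 - 4.0 * det2, 0.0)
    return math.sqrt(0.5 * (fro2 + math.sqrt(disc)))

def scaled_cost(t, m):
    """sigma_max(e^D M e^-D) with D = diag(t, 0)."""
    (m11, m12), (m21, m22) = m
    d = math.exp(t)
    return sigma_max_2x2(m11, d * m12, m21 / d, m22)

def golden_section(cost, lo, hi, tol=1e-10):
    """Derivative-free minimization of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if cost(c) < cost(d):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

# Hypothetical example matrix with two 1 x 1 blocks.
M = ((0.0, 4.0), (1.0, 0.0))
t_opt = golden_section(lambda t: scaled_cost(t, M), -5.0, 5.0)
upper_bound = scaled_cost(t_opt, M)
```

For this particular M, det(I - M diag(δ1, δ2)) = 1 - 4 δ1 δ2, so the smallest destabilizing perturbation has |δ1| = |δ2| = 1/2 and μ = 2; the search should return an `upper_bound` close to that value, consistent with Theorem 7.1.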


7.1.2  4  full  blocks 


Consider the case when $\boldsymbol{\Delta}$ consists of four 1 x 1 blocks, so s = 0, f = 4, and $m_j = 1$ for
each j. Let a, b, and c be positive real numbers, d and f be complex numbers, and $\psi_1$ and
$\psi_2$ be real angles. Define matrices $U, V \in \mathbb{C}^{4 \times 2}$ by

$$U = \begin{bmatrix} a & 0 \\ b & b \\ c & jc \\ d & f \end{bmatrix}, \qquad V = \begin{bmatrix} 0 & a \\ b & -b \\ c & -jc \\ e^{j\psi_1} f & e^{j\psi_2} d \end{bmatrix} \tag{7.3}$$

For the time being, suppose that these are both unitary matrices, so that $U^* U = V^* V = I_2$.
Later we will actually assign the correct values, but at the moment we just assume
this is already done. Then define $M \in \mathbb{C}^{4 \times 4}$ by

$$M := U V^* \tag{7.4}$$

With the assumptions of unitariness on U and V, (7.4) is a singular value decomposition
of M. M has two singular values at 1, and two singular values at 0. With respect to
the block structure $\boldsymbol{\Delta}$ that we have defined, what properties does the set $\nabla_M$ have? In
particular:

• is $0 \in \operatorname{co}(\nabla_M)$? If so, then $\inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D}) = 1$, otherwise, it is less than 1.

• is $0 \in \nabla_M$? If so, then $\mu(M) = \bar\sigma(M) = 1$, otherwise it is less than 1.


Since the multiplicity of the maximum singular value is 2, we can parametrize all unit
vectors in $\mathbb{C}^2$, and get a parametric representation of $\nabla_M$. It is easy to see that any vector
$\eta \in \mathbb{C}^2$, with $\|\eta\| = 1$ is of the form

$$\eta = \begin{bmatrix} e^{j\phi_1} \cos\theta \\ e^{j\phi_2} \sin\theta \end{bmatrix}$$

for some real $\phi_1$, $\phi_2$, and $\theta$. As it turns out, $\nabla_M$ depends only on the difference $\phi_1 - \phi_2$,
which we will denote as $\phi$.

Simply plugging in for the definition of $\nabla_M$ from section 6.1.1, we get

$$\nabla_M = \left\{ \begin{bmatrix} a^2 \left( \cos^2\theta - \sin^2\theta \right) \\ 4 b^2 \sin\theta \cos\theta \cos\phi \\ 4 c^2 \sin\theta \cos\theta \sin\phi \end{bmatrix} \in \mathbb{R}^3 : \phi, \theta \in \mathbb{R} \right\} \subset \mathbb{R}^3 \tag{7.5}$$

It is apparent that $0 \notin \nabla_M$. That would require (from the first coordinate in (7.5)) that
$\theta = \frac{\pi}{4} + \frac{n\pi}{2}$ for some integer n. The second and third coordinates being zero would then
require both $\cos\phi = 0$ and $\sin\phi = 0$, which is impossible. Hence $0 \notin \nabla_M$, and $\mu(M) < 1$.
On the other hand, setting $\theta = 0$, and then $\theta = \frac{\pi}{2}$, gives that both $[a^2\ 0\ 0]^T$ and $[-a^2\ 0\ 0]^T$
are elements of $\nabla_M$. Consequently, $0 \in \operatorname{co}(\nabla_M)$. Therefore

$$\inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right) = \bar\sigma(M) = 1$$


In order to complete the counterexample, we must choose the free variables so that U and
V in (7.4) are unitary, as we said we could. In fact here, we will choose them so that $\nabla_M$
is the boundary of a ball in $\mathbb{R}^3$, centered at the origin. The radius happens to be $2/(3+\sqrt{3})$.
This particular choice of parameters was obtained via a lot of algebra.


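Set $\gamma = 3 + \sqrt{3}$ and $\beta = \sqrt{3} - 1$ and define

$$a = \sqrt{\frac{2}{\gamma}}, \quad b = \frac{1}{\sqrt{\gamma}}, \quad c = \frac{1}{\sqrt{\gamma}}, \quad d = -\sqrt{\frac{\beta}{\gamma}}, \quad f = (1 + j) \sqrt{\frac{1}{\gamma\beta}}, \quad \psi_1 = -\frac{\pi}{2}, \quad \psi_2 = \pi$$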


Some algebra later, we conclude that $\nabla_M$ is the set of all $x \in \mathbb{R}^3$ such that $\|x\| = 2/(3+\sqrt{3})$.
Obviously, $0 \notin \nabla_M$, but $0 \in \operatorname{co} \nabla_M$. Extensive searching over the set $\mathcal{Q}$ in the lower bound
formula (recall that while the lower bound is always equal to $\mu$, unfortunately, it is not a concave
function, so gradient methods yield only local maxima) has revealed that for M defined
above, $\mu(M)$ is approximately 0.874.
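The two requirements on the free variables (orthonormal columns of U and V, and a spherical set in (7.5)) can be checked numerically. The sketch below is only a sanity check under stated assumptions: the values assigned to `a`, `b`, `c`, `d`, `f`, `psi1` and `psi2`, and the last row of `V`, are one candidate choice that satisfies both requirements, and should be treated as an assumption rather than as the values printed in the original report. It does not compute μ itself.

```python
import cmath
import math

gamma = 3.0 + math.sqrt(3.0)
beta = math.sqrt(3.0) - 1.0

# Candidate parameter values (assumed, see the text above).
a = math.sqrt(2.0 / gamma)
b = 1.0 / math.sqrt(gamma)
c = 1.0 / math.sqrt(gamma)
d = -math.sqrt(beta / gamma)
f = (1 + 1j) * math.sqrt(1.0 / (gamma * beta))
psi1, psi2 = -math.pi / 2.0, math.pi

# Rows of the 4 x 2 matrices U and V.
U = [(a, 0.0), (b, b), (c, 1j * c), (d, f)]
V = [(0.0, a), (b, -b), (c, -1j * c),
     (cmath.exp(1j * psi1) * f, cmath.exp(1j * psi2) * d)]

def gram(rows):
    """2 x 2 Gram matrix X^* X of a 4 x 2 matrix given as a list of rows."""
    return [[sum(complex(r[i]).conjugate() * r[k] for r in rows)
             for k in range(2)] for i in range(2)]

def nabla_point(theta, phi):
    """Point of the set (7.5) generated by the unit vector eta(theta, phi)."""
    eta = (cmath.exp(1j * phi) * math.cos(theta), math.sin(theta))
    point = []
    for e_row, f_row in zip(U[:3], V[:3]):      # the last block is not included
        e_eta = e_row[0] * eta[0] + e_row[1] * eta[1]
        f_eta = f_row[0] * eta[0] + f_row[1] * eta[1]
        point.append(abs(e_eta) ** 2 - abs(f_eta) ** 2)
    return point

gram_u = gram(U)
gram_v = gram(V)
radius = 2.0 / gamma
norms = [math.sqrt(sum(x * x for x in nabla_point(0.1 * i, 0.37 * i)))
         for i in range(60)]
```

With these values `gram_u` and `gram_v` should both be the 2 x 2 identity up to rounding, and every entry of `norms` should equal `radius`, so the sampled points lie on a sphere that does not contain the origin.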


Therefore, for the 4 full block problem, as opposed to the 2 full block problem, in general,
$\mu(M) \ne \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$. Since the full blocks in this counterexample are 1 x 1, they
may be viewed as repeated scalar blocks as well. Therefore this counterexample proves
that for every block structure $\boldsymbol{\Delta}$ satisfying $s + f \ge 4$, in general, we will have

$$\mu(M) \ne \inf_{D \in \mathcal{D}} \bar\sigma\left( e^{D} M e^{-D} \right).$$


7.1.3  3  Full  Blocks 


In view of the 2 previous sections, the only case with s = 0 that we don't know about is
3 full blocks. In this section, we will prove that indeed, $\nabla$ is always convex, and hence for
every matrix M, the infimum upper bound is equal to $\mu$. Recall that if $\boldsymbol{\Delta}$ consists of 3
full blocks (s = 0, f = 3), then $\nabla$ is of the form

$$\nabla = \left\{ \begin{bmatrix} \eta^* H_1 \eta \\ \eta^* H_2 \eta \end{bmatrix} \in \mathbb{R}^2 : \eta \in \mathbb{C}^r,\ \|\eta\| = 1 \right\} \subset \mathbb{R}^2$$

for some integer r, and hermitian matrices $H_1$ and $H_2 \in \mathbb{C}^{r \times r}$. Obviously, if r = 1, then
the set $\nabla$ is a single point, so it is convex. The next 3 lemmas will show that, for any
positive r, this is also convex.

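We begin with some notation from [Doy]. For any positive integer r, we define the sets
$P_r := \{x \in \mathbb{C}^r : \|x\| = 1\}$ and $S_r := \{u \in \mathbb{R}^{r+1} : \|u\| = 1\}$. If $H_1, H_2, \ldots, H_q$ are
hermitian matrices in $\mathbb{C}^{r \times r}$, we define a function $f_H : P_r \to \mathbb{R}^q$ by

$$f_H(\eta) := \begin{bmatrix} \eta^* H_1 \eta \\ \eta^* H_2 \eta \\ \vdots \\ \eta^* H_q \eta \end{bmatrix} \in \mathbb{R}^q \tag{7.7}$$

for each $\eta \in P_r$.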

Lemma 7.2 Let q be a positive integer. Let $a_i, c_i \in \mathbb{R}$, and $b_i \in \mathbb{C}$ for $i = 1, \ldots, q$. For
each i, define a hermitian 2 x 2 matrix $H_i$ by

$$H_i := \begin{bmatrix} a_i & b_i \\ \bar{b}_i & c_i \end{bmatrix}$$

Then there exists a vector $d \in \mathbb{R}^q$ and a matrix $V \in \mathbb{R}^{q \times 3}$ such that

$$f_H(P_2) = \{ d + V u : u \in S_2 \},$$

where $f_H$ is defined in (7.7).

Remark: In other words, the image of $P_2$ by $f_H$ is the image of an affine linear map on
the unit sphere in $\mathbb{R}^3$.

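Proof: First, we parametrize the unit vectors in $\mathbb{C}^2$ as

$$\eta = \begin{bmatrix} e^{j\omega_1} \cos\theta \\ e^{j\omega_2} \sin\theta \end{bmatrix}$$

for some real $\omega_1$, $\omega_2$ and $\theta$. As it turns out, only the difference $\omega_1 - \omega_2$ is important,
and we denote this as $\phi$.

Then, for any one of the particular $H_i$ (dropping the subscript i on a, b and c), and for any
$\eta \in \mathbb{C}^2$ with $\|\eta\| = 1$, we have

$$\begin{aligned} \eta^* H \eta &= \begin{bmatrix} e^{-j\omega_1} \cos\theta & e^{-j\omega_2} \sin\theta \end{bmatrix} \begin{bmatrix} a & b \\ \bar{b} & c \end{bmatrix} \begin{bmatrix} e^{j\omega_1} \cos\theta \\ e^{j\omega_2} \sin\theta \end{bmatrix} \\ &= a \cos^2\theta + c \sin^2\theta + 2 \left[ \operatorname{Re}(b) \cos\phi + \operatorname{Im}(b) \sin\phi \right] \cos\theta \sin\theta \\ &= \frac{a + c}{2} + \frac{a - c}{2} \cos 2\theta + 2 \left[ \operatorname{Re}(b) \cos\phi + \operatorname{Im}(b) \sin\phi \right] \cos\theta \sin\theta \\ &= \frac{a + c}{2} + \begin{bmatrix} \frac{a - c}{2} & \operatorname{Re}(b) & \operatorname{Im}(b) \end{bmatrix} \begin{bmatrix} \cos 2\theta \\ 2 \cos\phi \cos\theta \sin\theta \\ 2 \sin\phi \cos\theta \sin\theta \end{bmatrix} \end{aligned}$$

Note that the vector

$$\begin{bmatrix} \cos 2\theta \\ 2 \cos\phi \cos\theta \sin\theta \\ 2 \sin\phi \cos\theta \sin\theta \end{bmatrix}$$

is a parametrization of $S_2$. Hence setting $d_i := \frac{a_i + c_i}{2}$ and the i'th row of V, $v_i$, to

$$v_i := \begin{bmatrix} \frac{a_i - c_i}{2} & \operatorname{Re}(b_i) & \operatorname{Im}(b_i) \end{bmatrix}$$

proves the lemma. ∎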
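The identity at the heart of this proof is easy to check numerically. The following sketch evaluates the quadratic form directly and through the affine expression d + v·u; the function names and the sample values `a_val`, `b_val`, `c_val` are hypothetical and serve only as an illustration.

```python
import cmath
import math

def quad_form(a, b, c, omega1, omega2, theta):
    """eta^* H eta for H = [[a, b], [conj(b), c]] and the unit vector
    eta = (e^{j omega1} cos(theta), e^{j omega2} sin(theta))."""
    e1 = cmath.exp(1j * omega1) * math.cos(theta)
    e2 = cmath.exp(1j * omega2) * math.sin(theta)
    h_eta = (a * e1 + b * e2, b.conjugate() * e1 + c * e2)
    return (e1.conjugate() * h_eta[0] + e2.conjugate() * h_eta[1]).real

def sphere_point(omega1, omega2, theta):
    """The vector u on the unit sphere of R^3 used in the proof."""
    phi = omega1 - omega2
    return (math.cos(2 * theta),
            2 * math.cos(phi) * math.cos(theta) * math.sin(theta),
            2 * math.sin(phi) * math.cos(theta) * math.sin(theta))

def affine_form(a, b, c, omega1, omega2, theta):
    """d + v . u with d = (a + c)/2 and v = [(a - c)/2, Re b, Im b]."""
    u = sphere_point(omega1, omega2, theta)
    v = ((a - c) / 2, b.real, b.imag)
    return (a + c) / 2 + sum(vi * ui for vi, ui in zip(v, u))

# Hypothetical entries of one hermitian 2 x 2 matrix.
a_val, b_val, c_val = 1.3, 0.4 - 1.1j, -0.6
max_gap = max(
    abs(quad_form(a_val, b_val, c_val, 0.7 * k, -0.2 * k, 0.31 * k)
        - affine_form(a_val, b_val, c_val, 0.7 * k, -0.2 * k, 0.31 * k))
    for k in range(100))
```

`max_gap` should be at the level of rounding error, and `sphere_point` always returns a vector of unit length, which is the parametrization of $S_2$ used above.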


Lemma 7.3 Let $d \in \mathbb{R}^2$ and $V \in \mathbb{R}^{2 \times 3}$. Then the set $G_{d,V} := \{ d + V u : u \in S_2 \}$ is
convex.

Proof: Let $u_1, u_2 \in S_2$ and let $\lambda \in [0, 1]$. Obviously

$$\lambda (d + V u_1) + (1 - \lambda)(d + V u_2) = d + V \left( \lambda u_1 + (1 - \lambda) u_2 \right).$$

Now $\|\lambda u_1 + (1 - \lambda) u_2\| \le 1$. If it is equal to 1, we are done. Otherwise, we can add
to it a vector w in the null space of V (note that, because of the dimensions, V always has
a nontrivial null space) so that $u_3 := \lambda u_1 + (1 - \lambda) u_2 + w \in S_2$. Then

$$\lambda (d + V u_1) + (1 - \lambda)(d + V u_2) = d + V u_3 \in G_{d,V}.$$ ∎
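The construction of $u_3$ in this proof can be carried out explicitly. The sketch below assumes that the two rows of V are linearly independent, so that their cross product spans the null space, and solves a scalar quadratic for the multiple of that null vector which puts the combination back on the unit sphere. The helper names and the example data `V_rows`, `u1`, `u2` and `lam` are hypothetical.

```python
import math

def cross(p, q):
    """Cross product of two vectors in R^3."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def matvec(rows, u):
    """Product of a 2 x 3 matrix (given by rows) with a vector in R^3."""
    return tuple(sum(r[k] * u[k] for k in range(3)) for r in rows)

def lift_to_sphere(v_rows, u1, u2, lam):
    """Return u3 with |u3| = 1 and V u3 = V (lam u1 + (1 - lam) u2)."""
    u = tuple(lam * x + (1 - lam) * y for x, y in zip(u1, u2))
    n = cross(v_rows[0], v_rows[1])      # V n = 0 for independent rows
    nn = sum(x * x for x in n)
    un = sum(x * y for x, y in zip(u, n))
    uu = sum(x * x for x in u)
    # Solve |u + t n|^2 = 1, i.e. nn t^2 + 2 un t + (uu - 1) = 0, for t.
    t = (-un + math.sqrt(un * un + nn * (1.0 - uu))) / nn
    return tuple(x + t * y for x, y in zip(u, n))

# Hypothetical example data.
V_rows = ((1.0, 2.0, 0.0), (0.0, 1.0, -1.0))
u1 = (1.0, 0.0, 0.0)
u2 = (0.0, 0.6, 0.8)
lam = 0.3

u3 = lift_to_sphere(V_rows, u1, u2, lam)
target = tuple(lam * p + (1 - lam) * q
               for p, q in zip(matvec(V_rows, u1), matvec(V_rows, u2)))
image = matvec(V_rows, u3)
```

Here `image` should coincide with `target`, and `u3` should have unit length. Since $\|\lambda u_1 + (1 - \lambda) u_2\| \le 1$, the discriminant of the quadratic is never negative.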

Hence, for q = 2 and r = 2, the set $f_H(P_2) \subset \mathbb{R}^2$ is convex. For a block structure with
s = 0, f = 3, the set $\nabla$ is always of the form $f_H(P_r) \subset \mathbb{R}^2$ (i.e. q = 2). Recall though, that
in our application, r is the multiplicity of the maximum singular value. Conceivably, this
can be anything, hence we need to generalize the above reasoning for r > 2. This is easy.
Lemma 7.4 Let r be any positive integer. Let $H_1, H_2 \in \mathbb{C}^{r \times r}$ be hermitian matrices.
Then the set

$$f_H(P_r) = \left\{ \begin{bmatrix} \eta^* H_1 \eta \\ \eta^* H_2 \eta \end{bmatrix} : \eta \in \mathbb{C}^r,\ \|\eta\| = 1 \right\} \tag{7.9}$$

is convex.

Proof: Let $\eta_1, \eta_2$ be unit vectors in $\mathbb{C}^r$ and let $\lambda \in [0, 1]$. With $f_H$ defined in (7.9), we
need to find an $\eta_3 \in P_r$ such that

$$f_H(\eta_3) = \lambda f_H(\eta_1) + (1 - \lambda) f_H(\eta_2).$$

Without loss of generality, suppose $\eta_1 \ne \eta_2$. Choose orthogonal vectors $x, y \in P_r$
that span the same two-dimensional subspace as that spanned by $\eta_1$ and $\eta_2$. Define
two hermitian matrices $\hat{H}_1$ and $\hat{H}_2 \in \mathbb{C}^{2 \times 2}$ by

$$\hat{H}_i := [x\ y]^* H_i\, [x\ y].$$

Using these two matrices, and the definition of f in (7.7), we can naturally define
a function $f_{\hat{H}} : P_2 \to \mathbb{R}^2$. From Lemma 7.3, we know that the set $f_{\hat{H}}(P_2)$ is convex.
Since x and y are orthogonal, the matrix $[x\ y] \in \mathbb{C}^{r \times 2}$ is unitary, and there are
vectors $\zeta_1, \zeta_2 \in P_2$ such that $\eta_i = [x\ y]\, \zeta_i$ for each i = 1, 2. Therefore, for each i,
$f_H(\eta_i) = f_{\hat{H}}(\zeta_i)$. Now by convexity of $f_{\hat{H}}(P_2)$, there is a $\zeta_3 \in P_2$ such that

$$\lambda f_{\hat{H}}(\zeta_1) + (1 - \lambda) f_{\hat{H}}(\zeta_2) = f_{\hat{H}}(\zeta_3).$$

Let $\eta_3 \in P_r$ be defined by $\eta_3 := [x\ y]\, \zeta_3$. Note that $f_H(\eta_3) = f_{\hat{H}}(\zeta_3)$. Therefore,
$\lambda f_H(\eta_1) + (1 - \lambda) f_H(\eta_2) = f_H(\eta_3)$, so that $f_H(P_r)$ is indeed convex as claimed. ∎

7.1.4  Summary  for  block  structures  with  s  =  0 

The last three sections have shown the well known results for block structures with only
full blocks. These results were alluded to in the top row of the table from section 3.1. As
we noted in section 7.1.2, the counterexample for 4 full blocks is also a counterexample for
other block structures, since the full blocks in the example were 1 x 1 and could be viewed
as repeated scalar blocks as well.

It is not known what the worst ratio of $\mu$ over the upper bound can be. The 4 block
counterexample in this section has a ratio of approximately .874. Extensive computational
experience has failed to reveal another example which is worse, even for a much higher
number of blocks. There has not yet been a physically motivated example where the ratio
was more than .98.
The situation when there are also repeated scalar blocks, $s \ne 0$, has not been studied
as extensively. One of these structures is the topic of the next section.


7.2 Block structures with $s \ne 0$

As we saw in the last section, when s = 0 and $f \le 3$ (3 or fewer full blocks), the set $\nabla_M$ is
itself convex. Therefore for that block structure, $\mu(M) = \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$. In addition,
there also exist 4 block examples where $0 \in \operatorname{co}(\nabla)$ but $0 \notin \nabla$. Of course, by the previous
results, $\mu(M) < \inf_{D \in \mathcal{D}} \bar\sigma(e^{D} M e^{-D})$ in those instances.

Until now, the case of repeated scalar blocks ($s \ne 0$) has not been investigated. In section
6.1.1, we defined the correct $\nabla_M$ set to obtain descent directions for $\bar\sigma(e^{D} M e^{-D})$ when
repeated scalar blocks are part of the block structure. Then in section 6.2, we showed
that $0 \in \nabla_M$ if and only if $\mu(M) = \bar\sigma(M)$, a result previously known for the case of all
full blocks. In this section, we continue with structures having repeated scalar blocks. In
particular, we consider a block structure of one repeated scalar block, and one full block.
Recall the definition of $\nabla_M$, equation (6.14). With this structure, the set $\nabla_M$ will always
be of the form

$$\nabla = \left\{ A \eta \eta^* A^* - B \eta \eta^* B^* : \eta \in \mathbb{C}^r,\ \|\eta\| = 1 \right\} \tag{7.10}$$

for some given r > 0 and $A, B \in \mathbb{C}^{r_1 \times r}$. It is easy to see that in general, $\nabla$ is not convex.
For instance, take A = I and B = 0. Then $\nabla$ is all norm 1 dyads, but in general, a convex
combination of norm 1 dyads is not a norm 1 dyad, so $\nabla$ is not convex. However the
following (which is all we need) is always true.

Theorem 7.5 Let $\nabla$ be defined as in (7.10). If $0 \in \operatorname{co}(\nabla)$, then $0 \in \nabla$.

Proof: Suppose that $0 \in \operatorname{co}(\nabla)$. Then, for some integer p, there exist nonnegative
$\alpha_i$, $i = 1, 2, \ldots, p$ with $\sum_{i=1}^{p} \alpha_i = 1$ and vectors $\eta_i$, $i = 1, 2, \ldots, p$ with $\|\eta_i\| = 1$
such that

$$\sum_{i=1}^{p} \alpha_i \left( A \eta_i \eta_i^* A^* - B \eta_i \eta_i^* B^* \right) = 0 \tag{7.11}$$

which is rewritten as

$$A \left( \sum_{i=1}^{p} \alpha_i \eta_i \eta_i^* \right) A^* = B \left( \sum_{i=1}^{p} \alpha_i \eta_i \eta_i^* \right) B^* \tag{7.12}$$

Since the $\alpha_i$ are nonnegative, and not all 0, the dyad summation in (7.12) is a positive
semidefinite matrix that is not zero. Let $X^{1/2}$ be its hermitian, positive semidefinite
square root. Therefore

$$A X^{1/2} X^{1/2} A^* = B X^{1/2} X^{1/2} B^* \tag{7.13}$$

Hence, there is a unitary matrix V such that

$$A X^{1/2} = B X^{1/2} V \tag{7.14}$$

Let v be an eigenvector of V (with eigenvalue $e^{j\theta}$) such that $X^{1/2} v \ne 0$, and define
$u := X^{1/2} v$. Note that u is nonzero. This gives

$$A u = e^{j\theta} B u \tag{7.15}$$

which implies that $0 \in \nabla$. ∎


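What implication does this have? Obviously for this structure, $\mu(M) = \inf \bar\sigma(D M D^{-1})$.
Precisely, let M be a given matrix, partitioned as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$$

and suppose the dimensions are $M_{11} \in \mathbb{C}^{r_1 \times r_1}$ and $M_{22} \in \mathbb{C}^{m_1 \times m_1}$. Define $\boldsymbol{\Delta}$ as

$$\boldsymbol{\Delta} = \left\{ \operatorname{diag}\left[ \delta_1 I_{r_1}, \Delta_2 \right] : \delta_1 \in \mathbb{C},\ \Delta_2 \in \mathbb{C}^{m_1 \times m_1} \right\}$$

Then

$$\mu_{\boldsymbol{\Delta}}(M) = \inf_{\substack{D \in \mathbb{C}^{r_1 \times r_1} \\ D \text{ invertible}}} \bar\sigma \begin{bmatrix} D M_{11} D^{-1} & D M_{12} \\ M_{21} D^{-1} & M_{22} \end{bmatrix}$$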
8 Linear Fractional Transformations


8.1 Introduction


Using only the definition of $\mu$, we can prove some rather simple theorems about a class of
general linear feedback loops called Linear Fractional Transformations. To introduce
these, consider a complex matrix M partitioned as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \tag{8.1}$$

and suppose there is a defined block structure $\boldsymbol{\Delta}$ which is compatible in size with $M_{11}$.
For $\Delta \in \boldsymbol{\Delta}$, consider the following loop equations,

$$\begin{aligned} e &= M_{21} w + M_{22} d \\ z &= M_{11} w + M_{12} d \\ w &= \Delta z \end{aligned} \tag{8.2}$$

This set of equations (8.2) is called well posed if for any vector d, there exist unique
vectors w, z, and e satisfying the loop equations. It is easy to see that the set of equations
is well posed if and only if the inverse of $I - M_{11} \Delta$ exists. If not, then depending on d
and M, there is either no solution to the loop equations, or there are an infinite number
of solutions. When the inverse does indeed exist, we have $e = F_u(M, \Delta)\, d$ where

$$F_u(M, \Delta) = M_{22} + M_{21} \Delta \left( I - M_{11} \Delta \right)^{-1} M_{12} \tag{8.3}$$

$F_u(M, \Delta)$ is called a Linear Fractional Transformation on M by $\Delta$, and in a feedback
diagram appears as:


Figure  8.1  Linear  Fractional  Transformation 


From a system point of view, we interpret vector d as the “disturbance”, and e is the
“error”, whereas vectors z and w are internal variables. $M_{22}$ is the nominal map between
the disturbance and error, and $\Delta$ represents unknown quantities, called perturbations,
which affect the map in a known way, namely through $M_{12}$, $M_{21}$, $M_{11}$, and the formula $F_u$.


The subscript u on $F_u$ indicates that the “upper” loop of M is closed by $\Delta$. An analogous
formula describes $F_l(M, \Delta)$, which is the resulting matrix obtained by closing the “lower”
loop of M (assuming the dimensions are compatible and the implied inverse exists).
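To make the loop equations and the formula for $F_u$ concrete, here is a minimal sketch for the simplest case in which every block of M and the perturbation are scalars. The function names, the tolerance in the well-posedness check and the numerical values are illustrative assumptions; in the matrix case the division would be replaced by a linear solve.

```python
def lft_upper(m11, m12, m21, m22, delta):
    """F_u(M, Delta) for scalar blocks; requires 1 - m11 * delta != 0."""
    gap = 1.0 - m11 * delta
    if abs(gap) < 1e-12:
        raise ZeroDivisionError("loop is not well posed: 1 - M11*Delta = 0")
    return m22 + m21 * delta * m12 / gap

def solve_loop(m11, m12, m21, m22, delta, d):
    """Solve the loop equations directly for (w, z, e), given d."""
    z = m12 * d / (1.0 - m11 * delta)    # from z = M11*Delta*z + M12*d
    w = delta * z
    e = m21 * w + m22 * d
    return w, z, e

# Hypothetical scalar data.
m11, m12, m21, m22 = 0.5 + 0.2j, 1.0, -2.0j, 0.3
delta, d = 0.4 - 0.1j, 1.5

w, z, e = solve_loop(m11, m12, m21, m22, delta, d)
e_from_lft = lft_upper(m11, m12, m21, m22, delta) * d
```

Both routes should give the same error signal, that is, `e` and `e_from_lft` agree up to rounding, and setting `delta = 1 / m11` makes `lft_upper` raise because the loop is then not well posed.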


The constant matrix problem that we would like to solve is:

• determine whether the LFT is well posed for all $\Delta$ in some prescribed subset $\Omega \subset \boldsymbol{\Delta}$
and,

• if so, then determine how “large” $F_u(M, \Delta)$ can get for $\Delta$ in $\Omega$.

The next section has three simple theorems which answer this problem.


8.2  Well  Posedness  and  Performance  for  Constant  LFT’s 

One appealing use of $\mu$ is to determine the well posedness of a linear fractional transformation
on a structured $\Delta$, and to determine how “big” the linear fractional transformation
can get. As we will see, $\mu$ answers these questions. Of course, using the results here will
require that we can compute $\mu$.

Consider a complex matrix M partitioned as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \tag{8.4}$$

and suppose there are two defined block structures $\boldsymbol{\Delta}_1$ and $\boldsymbol{\Delta}_2$ which are compatible in
size with $M_{11}$ and $M_{22}$ respectively. Define a third structure $\boldsymbol{\Delta}$ as

$$\boldsymbol{\Delta} = \left\{ \begin{bmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{bmatrix} : \Delta_1 \in \boldsymbol{\Delta}_1,\ \Delta_2 \in \boldsymbol{\Delta}_2 \right\}$$

Now we have three structures with respect to which we may compute $\mu$. The notation
we will use to keep track of this is as follows: $\mu_1(\cdot)$ is with respect to $\boldsymbol{\Delta}_1$, $\mu_2(\cdot)$ is with
respect to $\boldsymbol{\Delta}_2$, and $\mu_{1,2}(\cdot)$ is with respect to $\boldsymbol{\Delta}$. In view of this, $\mu_1(M_{11})$, $\mu_2(M_{22})$ and
$\mu_{1,2}(M)$ all make sense, though for instance, $\mu_1(M)$ does not.

The first theorem addresses the well posedness of the LFT $F_u(M, \Delta_1)$, and is nothing
more than a restatement of the definition of $\mu$.

Theorem 8.1 Let $\beta > 0$. The LFT is well posed for all $\Delta_1 \in \frac{1}{\beta} B\boldsymbol{\Delta}_1$ if and only if
$\mu_1(M_{11}) < \beta$.

Note that the < and ≤ signs can be exchanged and the theorem is still true. An imprecise
but important notion to get from this is that the minimum amount of structured feedback
necessary to cause a loop to be ill posed is inversely proportional to $\mu$ of the open loop.


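As the “perturbation” $\Delta_1$ deviates from zero, the matrix relating d to e deviates from $M_{22}$.
Using the quantity $\mu_{1,2}(M)$, we can bookkeep what happens to $\mu_2(F_u(M, \Delta_1))$ as follows:

Theorem 8.2 (Robust Performance: constant) Let $\beta > 0$. Then $\mu_{1,2}(M) < \beta$ if and
only if $\mu_1(M_{11}) < \beta$, and for all $\Delta_1 \in \frac{1}{\beta} B\boldsymbol{\Delta}_1$, $\mu_2(F_u(M, \Delta_1)) < \beta$.

Proof:

(⇐) Let $\Delta_i \in \boldsymbol{\Delta}_i$ be given, with $\bar\sigma(\Delta_i) \le \frac{1}{\beta}$, and define $\Delta = \operatorname{diag}[\Delta_1, \Delta_2]$. Obviously
$\Delta \in \boldsymbol{\Delta}$. Now

$$\det(I - M\Delta) = \det \begin{bmatrix} I - M_{11}\Delta_1 & -M_{12}\Delta_2 \\ -M_{21}\Delta_1 & I - M_{22}\Delta_2 \end{bmatrix}$$

By hypothesis $I - M_{11}\Delta_1$ is invertible, hence $\det(I - M\Delta)$ becomes

$$\det(I - M_{11}\Delta_1)\, \det\left( I - M_{22}\Delta_2 - M_{21}\Delta_1 (I - M_{11}\Delta_1)^{-1} M_{12}\Delta_2 \right)$$

Collecting the $\Delta_2$ terms leaves

$$\det(I - M\Delta) = \det(I - M_{11}\Delta_1)\, \det\left( I - F_u(M, \Delta_1)\Delta_2 \right)$$

We also have $\mu_2(F_u(M, \Delta_1)) < \beta$, so, since $\bar\sigma(\Delta_2) \le \frac{1}{\beta}$, the quantity $I - F_u(M, \Delta_1)\Delta_2$
must be nonsingular. Therefore $I - M\Delta$ is nonsingular, so $\mu_{1,2}(M) < \beta$.

(⇒) Basically, you just reverse the argument above, but we include this for completeness.
Again let $\Delta_1 \in \boldsymbol{\Delta}_1$ and $\Delta_2 \in \boldsymbol{\Delta}_2$ be given, with $\bar\sigma(\Delta_i) \le \frac{1}{\beta}$, and define
$\Delta = \operatorname{diag}[\Delta_1, \Delta_2]$. By hypothesis, we know that $I - M\Delta$ is nonsingular. It is easy
to verify from the definition of $\mu$ that (always)

$$\mu_{1,2}(M) \ge \max\{ \mu_1(M_{11}),\ \mu_2(M_{22}) \}$$

so we also have $\mu_1(M_{11}) < \beta$, which gives that $I - M_{11}\Delta_1$ is nonsingular too.
Therefore

$$\det(I - M_{11}\Delta_1)\, \det\left( I - F_u(M, \Delta_1)\Delta_2 \right) = \det(I - M\Delta) \ne 0$$

Obviously, $I - F_u(M, \Delta_1)\Delta_2$ is nonsingular. ∎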
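The determinant factorization used in both directions of this proof can be checked on a small example. The sketch below assumes scalar blocks throughout, so that M is 2 x 2 and the two perturbations are complex numbers; the function names and data are hypothetical.

```python
def lft_upper(m, delta1):
    """F_u(M, Delta_1) for a 2 x 2 matrix M with scalar blocks."""
    (m11, m12), (m21, m22) = m
    return m22 + m21 * delta1 * m12 / (1.0 - m11 * delta1)

def det_i_minus_m_delta(m, delta1, delta2):
    """det(I - M diag(delta1, delta2)), written out for the 2 x 2 case."""
    (m11, m12), (m21, m22) = m
    return ((1.0 - m11 * delta1) * (1.0 - m22 * delta2)
            - (m12 * delta2) * (m21 * delta1))

def factored_det(m, delta1, delta2):
    """det(I - M11 Delta_1) * det(I - F_u(M, Delta_1) Delta_2)."""
    (m11, _), _ = m
    return (1.0 - m11 * delta1) * (1.0 - lft_upper(m, delta1) * delta2)

# Hypothetical data.
M = ((0.2 + 0.5j, 1.1), (-0.7j, 0.4 - 0.3j))
delta1, delta2 = 0.6 - 0.2j, -0.3 + 0.9j

direct = det_i_minus_m_delta(M, delta1, delta2)
factored = factored_det(M, delta1, delta2)
```

`direct` and `factored` should agree up to rounding whenever $1 - M_{11}\Delta_1 \ne 0$, which is the scalar version of the identity obtained by collecting the $\Delta_2$ terms.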


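An identical proof switches the < and ≤ signs:


Theorem 8.3 Let $\beta > 0$. Then $\mu_{1,2}(M) \le \beta$ if and only if $\mu_1(M_{11}) \le \beta$, and for all
$\Delta_1 \in \boldsymbol{\Delta}_1$ with $\bar\sigma(\Delta_1) < \frac{1}{\beta}$, $\mu_2(F_u(M, \Delta_1)) \le \beta$.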




Roughly speaking, we have a test that determines if for all $\bar\sigma(\Delta_1) < \frac{1}{\beta}$, the quantity
$\mu_2(F_u(M, \Delta_1))$ stays bounded by $\beta$. Since both $\rho(\cdot)$ and $\bar\sigma(\cdot)$ are special cases of $\mu$,
by the appropriate choice of the set $\boldsymbol{\Delta}_2$, either $\rho(F_u(M, \Delta_1))$ or $\bar\sigma(F_u(M, \Delta_1))$ could be
“watched”. Of course for different choices of $\boldsymbol{\Delta}_2$, the theorem gives information about
$\mu_2(F_u(M, \Delta_1))$.

Note that in this test, the bound we get on the performance is dependent on the bound we
set on the perturbation, namely they are reciprocals. For other values, we must scale M
and recompute. Specifically, for $\alpha \ge 0$, define $M_\alpha$ as

$$M_\alpha = \begin{bmatrix} M_{11} & M_{12} \\ \alpha M_{21} & \alpha M_{22} \end{bmatrix} \tag{8.7}$$

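Some simple facts about $M_\alpha$:


• If $\alpha = 0$ then $\mu_{1,2}(M_\alpha) = \mu_1(M_{11})$

• For any $\Delta_1 \in \boldsymbol{\Delta}_1$, $F_u(M_\alpha, \Delta_1) = \alpha F_u(M, \Delta_1)$ (as long as the inverse exists)

• $\max\{ \mu_1(M_{11}),\ \alpha \mu_2(M_{22}) \} \le \mu_{1,2}(M_\alpha) \le \max\{1, \alpha\}\, \mu_{1,2}(M)$

• $\mu_{1,2}(M_\alpha)$ is a continuous, nondecreasing function of $\alpha$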
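The second fact in this list, which is the one the scaling argument relies on, can be confirmed numerically. This is a minimal sketch for scalar blocks; `scale_lower_rows` is a hypothetical helper that forms $M_\alpha$ as in (8.7), and the data are illustrative.

```python
def lft_upper(m, delta1):
    """F_u(M, Delta_1) for a 2 x 2 matrix M with scalar blocks."""
    (m11, m12), (m21, m22) = m
    return m22 + m21 * delta1 * m12 / (1.0 - m11 * delta1)

def scale_lower_rows(m, alpha):
    """M_alpha of (8.7): the second block row of M multiplied by alpha."""
    (m11, m12), (m21, m22) = m
    return ((m11, m12), (alpha * m21, alpha * m22))

# Hypothetical data.
M = ((0.3 - 0.4j, 0.8), (1.5j, -0.2 + 0.1j))
delta1 = 0.5 + 0.5j
alphas = [0.0, 0.25, 1.0, 3.0]

gaps = [abs(lft_upper(scale_lower_rows(M, alpha), delta1)
            - alpha * lft_upper(M, delta1))
        for alpha in alphas]
```

Every entry of `gaps` should vanish up to rounding. The scaling leaves $M_{11}$ untouched, so the well-posedness condition on $\Delta_1$ is the same for M and $M_\alpha$.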


Let $\gamma > \mu_1(M_{11})$ be given, and define

$$\alpha_\gamma = \max_{\alpha \ge 0} \left\{ \alpha : \mu_{1,2}(M_\alpha) = \gamma \right\} \tag{8.8}$$


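These lead to the following variant of Theorems 8.2 and 8.3:

Theorem 8.4 (Worst Case: constant) Let $\gamma > \mu_1(M_{11})$ be given, and $\alpha_\gamma$ be computed
from (8.8). Then

$$\sup_{\Delta_1 \in \frac{1}{\gamma} B\boldsymbol{\Delta}_1} \mu_2\left( F_u(M, \Delta_1) \right) = \frac{\gamma}{\alpha_\gamma} \tag{8.9}$$

Remark: The basic idea of the theorem is this: find the largest $\alpha$ such that for all
$\Delta_1 \in \frac{1}{\gamma} B\boldsymbol{\Delta}_1$, we still get $\mu_2(F_u(M, \Delta_1)) \le \frac{\gamma}{\alpha}$. By the 2nd fact above, this is the
same as: find the largest $\alpha$ such that for all $\bar\sigma(\Delta_1) \le \frac{1}{\gamma}$, $\mu_2(F_u(M_\alpha, \Delta_1)) \le \gamma$. This test
we can do, by applying Theorem 8.2 on $M_\alpha$, which then gives the result.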
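Proof: Since $\gamma > \mu_1(M_{11})$, the left hand side of (8.9) is always well defined. By definition of $\alpha_\gamma$, we know that $\mu_{1,2}(M_{\alpha_\gamma}) = \gamma$, and for any $\epsilon > 0$, $\mu_{1,2}(M_{\alpha_\gamma + \epsilon}) > \gamma$. Applying Theorem 8.3 to $M_{\alpha_\gamma}$ with $\beta = \gamma$ gives

$$\text{if } \Delta_1 \in \boldsymbol{\Delta}_1,\ \bar\sigma(\Delta_1) < \frac{1}{\gamma}, \text{ then } \mu_2(F_u(M, \Delta_1)) \le \frac{\gamma}{\alpha_\gamma}$$

Since $F_u(M, \Delta_1)$ is well defined and continuous on $\left\{\Delta_1 : \bar\sigma(\Delta_1) \le \frac{1}{\gamma}\right\}$ we have

$$\sup_{\Delta_1 \in \frac{1}{\gamma}\mathbf{B}\boldsymbol{\Delta}_1} \mu_2(F_u(M, \Delta_1)) \le \frac{\gamma}{\alpha_\gamma}$$

Suppose it is truly less. Then for some $\epsilon > 0$

$$\sup_{\Delta_1 \in \frac{1}{\gamma}\mathbf{B}\boldsymbol{\Delta}_1} \mu_2(F_u(M, \Delta_1)) = \frac{\gamma}{\alpha_\gamma + \epsilon}$$

which implies that $\mu_{1,2}(M_{\alpha_\gamma + \epsilon}) \le \gamma$, a contradiction of the definition of $\alpha_\gamma$. $\Box$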

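Corollary 8.5 If $\mu_{1,2}(M_\alpha)$ is an increasing function of $\alpha$ (not just nondecreasing) then

$$\sup\, \mu_2(F_u(M, \Delta_1)) = \ \ldots$$

(the right-hand side of this equation is illegible in the source).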


Finally, we state a maximum modulus like result for $\mu$. The proof uses Theorem 4.5 from the previous section, along with ideas similar to the ones here.

Theorem 8.6 (Maximum modulus: LFT) Let $M$ be given as in (8.4), along with two block structures $\boldsymbol{\Delta}_1$ and $\boldsymbol{\Delta}_2$. Suppose that $\mu_1(M_{11}) < 1$. Then

$$\max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \mu_2(F_u(M, \Delta_1)) = \max_{Q_1 \in \mathcal{Q}_1} \mu_2(F_u(M, Q_1)) \tag{8.10}$$


Remark: In light of this, any $\mu$ test with at least one repeated scalar block can always be reduced to a one dimensional search of $\mu$ tests without that block. A similar result to Theorem 8.6 is in [BoyD]. They show that for any $H$ bounded and analytic on $|z| < 1$, the function $k(z) := \mu(H(z))$ is subharmonic.

Finally, we note that Theorems 8.1 through 8.4, along with Corollary 8.5 and Theorem 8.6, have obvious analogs dealing with the behavior of $F_l(M, \Delta)$ under structured perturbations. In this section, all of the results were stated and proven for $F_u(M, \Delta)$. Throughout this thesis, we will use the result of either type without special mention.


8.3  Examples  of  LFT’s 


8.3.1  Transfer  functions  as  LFT’s 


Consider  a  stable,  discrete  time,  linear  system 


$$\begin{aligned} x_{k+1} &= A x_k + B u_k \\ y_k &= C x_k + D u_k \end{aligned} \tag{8.11}$$


with transfer function $G(z) = D + C(zI - A)^{-1}B$ ($n$ states, and for simplicity, we assume that this has $m$ inputs and outputs, though everything that follows holds for nonsquare plants also). The infinity norm of $G$ is defined as

$$\|G\|_\infty = \sup_{z \in \mathbb{C},\ |z| \ge 1} \bar\sigma(G(z))$$

which is equivalent to

$$\|G\|_\infty = \sup_{\delta \in \mathbb{C},\ |\delta| \le 1} \bar\sigma\left(D + \delta C(I - \delta A)^{-1}B\right) \tag{8.12}$$


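Define $\boldsymbol{\Delta}_1 = \{\delta I_n : \delta \in \mathbb{C}\}$, $\boldsymbol{\Delta}_2 = \mathbb{C}^{m \times m}$ and

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathbb{R}^{(n+m) \times (n+m)} \tag{8.13}$$

In $\mu$ notation, we can write (8.12) as

$$\|G\|_\infty = \sup_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \mu_2(F_u(M, \Delta_1)) \tag{8.14}$$

because the block structure $\boldsymbol{\Delta}_2$ implies that $\mu_2(\cdot) = \bar\sigma(\cdot)$, and $\boldsymbol{\Delta}_1$ has been defined to represent the z-transform variable. Applying Theorem 8.2, with $\beta = 1$, gives

$$\|G\|_\infty < 1 \iff \mu_{1,2}(M) < 1. \tag{8.15}$$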


In view of the result in section 7.2, actually $\|G\|_\infty < 1$ if and only if there exists a coordinate transformation $T \in \mathbb{C}^{n \times n}$ such that

$$\bar\sigma\left(\begin{bmatrix} TAT^{-1} & TB \\ CT^{-1} & D \end{bmatrix}\right) < 1$$

Hence, we have an algorithm for generating all stable rational transfer functions that have $\|\cdot\|_\infty < 1$. Simply choose any matrix $M$ so that $\bar\sigma(M) < 1$ and partition $M$ as shown above. Then $G$ will be stable, and have norm less than one, and all stable rational $G(z)$ with $\|G\|_\infty < 1$ can be generated in this fashion.
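The generating procedure just described can be sketched as follows. This is an illustration only: the sizes $n = 4$, $m = 2$, the target value $\bar\sigma(M) = 0.9$, the random seed and the frequency grid are assumptions, and evaluating $G$ on a finite grid of the unit circle only gives a lower estimate of $\|G\|_\infty$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2                          # illustrative state and input/output dimensions

# Choose any M with sigma_max(M) < 1 (here exactly 0.9) and partition it as [A B; C D].
M = rng.standard_normal((n + m, n + m))
M *= 0.9 / np.linalg.norm(M, 2)
A, B = M[:n, :n], M[:n, n:]
C, D = M[n:, :n], M[n:, n:]

# Stability: rho(A) <= sigma_max(A) <= sigma_max(M) < 1.
spectral_radius = max(abs(np.linalg.eigvals(A)))

def G(z):
    """Transfer function G(z) = D + C (zI - A)^{-1} B."""
    return D + C @ np.linalg.solve(z * np.eye(n) - A, B)

# Largest singular value of G over a grid of the unit circle.
thetas = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
gridded_norm = max(np.linalg.norm(G(np.exp(1j * t)), 2) for t in thetas)
```

Both `spectral_radius` and `gridded_norm` come out below one for any such $M$; the converse direction (every $G$ with $\|G\|_\infty < 1$ arises this way) needs the coordinate change $T$ and is not exercised by this sketch.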



This result can also be shown using results from dissipative systems, and linear quadratic optimal control theory (with nondefinite cost functions). In fact, if $\|G\|_\infty < 1$, then solving one Riccati equation yields a $T \in \mathbb{C}^{n \times n}$ such that

$$\bar\sigma\left(\begin{bmatrix} TAT^{-1} & TB \\ CT^{-1} & D \end{bmatrix}\right) = 1$$

The details of this calculation are interesting, and follow straightforwardly from the results in [Wil]. We do not include them here because the Riccati solution has the undesirable property that $n$ of the singular values will be coalesced at $\sigma = 1$. This seems to limit the usefulness of the Riccati solution as a viable computational alternative to gradient searching along the “full” $D$ directions.

In this example, the “perturbation” is the repeated scalar block, and for the $\|\cdot\|_\infty$ norm, it must correspond to the unit disk. Using Theorem 8.2 with $\beta$ equal to 1, we can only check if $\|G\|_\infty$ is less than 1. For other values, we must scale $G$ and recompute, using Theorem 8.4. Namely, define $\alpha$ as

$$\alpha = \max_{\alpha > 0} \left\{ \alpha : \inf_{T \in \mathbb{C}^{n \times n}} \bar\sigma\left(\begin{bmatrix} TAT^{-1} & TB \\ \alpha CT^{-1} & \alpha D \end{bmatrix}\right) = 1 \right\} \tag{8.16}$$

Then the worst case theorem, Theorem 8.4 (with $\gamma = 1$) gives

$$\|G\|_\infty = \frac{1}{\alpha} \tag{8.17}$$


8.3.2  Keeping  LFT’s  large 

Just as $\mu$ can be used to determine how big the maximum singular value (or spectral radius) of an LFT can get, we can also use it to determine if the minimum singular value will remain bounded away from 0 (and, of course, the minimum eigenvalue too). Of course, the motivation of the LFT as a “perturbed disturbance to error” is no longer applicable, but this problem is interesting in its own right. The key to all this is that the inverse of an LFT, $F_l(M, \Delta)$, is itself an LFT, on the same $\Delta$, but with a different known matrix, $M_I$.

This  section  will  present  these  types  of  results.  All  are  obtained  from  the  well  known 
“matrix  inversion  lemma”,  which  we  review  for  completeness.  We  begin  with  a  lemma 
that  is  fundamental  to  the  matrix  inversion  lemma. 

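Lemma 8.7 Let $A$, $B$, $C$, and $D$ be complex matrices, $A \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times m}$, $C \in \mathbb{C}^{m \times m}$, $D \in \mathbb{C}^{m \times n}$. Suppose that $A$ and $C$ are each invertible. Then $A + BCD$ is invertible if and only if $C^{-1} + DA^{-1}B$ is invertible.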



Proof: Taking determinants, we get

$$\begin{aligned} \det(A + BCD) &= \det A \, \det(I + A^{-1}BCD) \\ &= \det A \, \det(I + DA^{-1}BC) \\ &= \det A \, \det(C^{-1} + DA^{-1}B) \, \det C. \qquad \Box \end{aligned}$$

In  order  to  evaluate  how  small  things  actually  get,  we  need  the  matrix  inversion  lemma. 

Lemma 8.8 Suppose $A$, $B$, $C$, and $D$ are given as in Lemma 8.7. If $A$, $C$, and $A + BCD$ are invertible, then

$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}$$

Proof: By Lemma 8.7, $C^{-1} + DA^{-1}B$ is invertible; the result follows by verification. $\Box$
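Both lemmas can be spot-checked numerically. The sketch below is an illustration under assumed data: the helper `unit_norm`, the sizes and the particular well-conditioned choices of $A$, $B$, $C$, $D$ are not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 3                                   # illustrative sizes

def unit_norm(rows, cols):
    """Random real matrix scaled so that its largest singular value is 1."""
    G = rng.standard_normal((rows, cols))
    return G / np.linalg.norm(G, 2)

# A and C are perturbations of 3I, hence invertible; B and D are small.
A = 3.0 * np.eye(n) + unit_norm(n, n)
C = 3.0 * np.eye(m) + unit_norm(m, m)
B = 0.5 * unit_norm(n, m)
D = 0.5 * unit_norm(m, n)

A_inv, C_inv = np.linalg.inv(A), np.linalg.inv(C)
inner = C_inv + D @ A_inv @ B                 # the matrix C^{-1} + D A^{-1} B

# Determinant chain from the proof of Lemma 8.7.
det_lhs = np.linalg.det(A + B @ C @ D)
det_rhs = np.linalg.det(A) * np.linalg.det(inner) * np.linalg.det(C)

# Matrix inversion lemma (Lemma 8.8).
inv_lhs = np.linalg.inv(A + B @ C @ D)
inv_rhs = A_inv - A_inv @ B @ np.linalg.solve(inner, D @ A_inv)
```

The two determinants agree, and so do the two expressions for the inverse, up to rounding.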

Now, let $M$ be given, partitioned in a $2 \times 2$ fashion as in (8.4), and let $\boldsymbol{\Delta}_2$ be a given structure, compatible with $M_{22}$. Suppose $M_{11}$ is square, hence $F_l(M, \Delta_2)$ is square too. Under what conditions is $F_l(M, \Delta_2)$ invertible for all $\Delta_2 \in \boldsymbol{\Delta}_2$ with $\bar\sigma(\Delta_2) \le \frac{1}{\beta}$?

First, we require that it be well defined for all such $\Delta_2$, so we need $\mu_2(M_{22}) < \beta$. This guarantees that $I - M_{22}\Delta_2$ will be invertible. Second, it is obvious that $M_{11}$ needs to be invertible, otherwise the LFT is not invertible even for $\Delta_2 = 0$.

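Theorem 8.9 Let $M$ be given, with the following assumptions: $M_{11}$ is square and invertible, and $\mu_2(M_{22}) < \beta$. Then for all $\Delta_2 \in \frac{1}{\beta}\mathbf{B}\boldsymbol{\Delta}_2$ the LFT $F_l(M, \Delta_2)$ is invertible if and only if $\mu_2\left(M_{22} - M_{21}M_{11}^{-1}M_{12}\right) < \beta$.

Proof: Since $M_{11}$ is invertible, and $\mu_2(M_{22}) < \beta$, and $\bar\sigma(\Delta_2) \le \frac{1}{\beta}$, we can apply Lemma 8.7 to determine the invertibility of

$$\underbrace{M_{11}}_{A} + \underbrace{M_{12}\Delta_2}_{B}\ \underbrace{(I - M_{22}\Delta_2)^{-1}}_{C}\ \underbrace{M_{21}}_{D}.$$

This is invertible if and only if

$$I - M_{22}\Delta_2 + M_{21}M_{11}^{-1}M_{12}\Delta_2 \tag{8.18}$$

is invertible. Recall the definition of $\mu$. The quantity in (8.18) is invertible for all $\Delta_2 \in \boldsymbol{\Delta}_2$ with $\bar\sigma(\Delta_2) \le \frac{1}{\beta}$ if and only if $\mu_2\left(M_{22} - M_{21}M_{11}^{-1}M_{12}\right) < \beta$. $\Box$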


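Now apply the matrix inversion lemma to get an expression for $[F_l(M, \Delta_2)]^{-1}$. If $F_l(M, \Delta_2)$ is invertible, then

$$[F_l(M, \Delta_2)]^{-1} = M_{11}^{-1} - M_{11}^{-1}M_{12}\Delta_2\left[I - \left(M_{22} - M_{21}M_{11}^{-1}M_{12}\right)\Delta_2\right]^{-1}M_{21}M_{11}^{-1}$$

If we define a matrix $M_I$ as

$$M_I = \begin{bmatrix} M_{11}^{-1} & -M_{11}^{-1}M_{12} \\ M_{21}M_{11}^{-1} & M_{22} - M_{21}M_{11}^{-1}M_{12} \end{bmatrix}$$

then

$$[F_l(M, \Delta_2)]^{-1} = F_l(M_I, \Delta_2)$$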
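The statement that the inverse of an LFT is again an LFT on the same $\Delta_2$ can be verified directly. The following sketch uses hypothetical helper names (`lower_lft`, `inverse_lft_matrix`, `unit_norm`) and an assumed, well-conditioned test matrix; it is meant only as a numerical sanity check of the formula for $M_I$.

```python
import numpy as np

def lower_lft(M, Delta2, n1):
    """F_l(M, Delta2) = M11 + M12 Delta2 (I - M22 Delta2)^{-1} M21."""
    M11, M12 = M[:n1, :n1], M[:n1, n1:]
    M21, M22 = M[n1:, :n1], M[n1:, n1:]
    n2 = M22.shape[0]
    return M11 + M12 @ Delta2 @ np.linalg.solve(np.eye(n2) - M22 @ Delta2, M21)

def inverse_lft_matrix(M, n1):
    """The matrix M_I whose lower LFT is the inverse of F_l(M, .)."""
    M11, M12 = M[:n1, :n1], M[:n1, n1:]
    M21, M22 = M[n1:, :n1], M[n1:, n1:]
    M11_inv = np.linalg.inv(M11)
    return np.block([[M11_inv, -M11_inv @ M12],
                     [M21 @ M11_inv, M22 - M21 @ M11_inv @ M12]])

rng = np.random.default_rng(3)
n1, n2 = 3, 2                                  # illustrative sizes

def unit_norm(rows, cols):
    """Random real matrix scaled so that its largest singular value is 1."""
    G = rng.standard_normal((rows, cols))
    return G / np.linalg.norm(G, 2)

# M11 is invertible; the remaining blocks are small so every inverse below exists.
M = np.block([[3.0 * np.eye(n1) + unit_norm(n1, n1), 0.5 * unit_norm(n1, n2)],
              [0.5 * unit_norm(n2, n1), 0.3 * unit_norm(n2, n2)]])
Delta2 = np.diag(rng.uniform(-1.0, 1.0, n2))   # illustrative structured perturbation

F = lower_lft(M, Delta2, n1)
F_inv_via_lft = lower_lft(inverse_lft_matrix(M, n1), Delta2, n1)
inverse_error = np.linalg.norm(F @ F_inv_via_lft - np.eye(n1))
```

A small `inverse_error` confirms that $F_l(M_I, \Delta_2)$ inverts $F_l(M, \Delta_2)$ for this sample.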


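Theorem 8.10 Suppose that $M_{11}$ is invertible, $\mu_2\left(M_{22} - M_{21}M_{11}^{-1}M_{12}\right) < \beta$, and $\mu_2(M_{22}) < \beta$. Then, in view of the discussion, the following equivalences make sense and are true:

$$\min_{\Delta_2 \in \frac{1}{\beta}\mathbf{B}\boldsymbol{\Delta}_2} \underline\sigma\left[F_l(M, \Delta_2)\right] > \frac{1}{\beta} \iff \max_{\Delta_2 \in \frac{1}{\beta}\mathbf{B}\boldsymbol{\Delta}_2} \bar\sigma\left(F_l(M_I, \Delta_2)\right) < \beta \iff \mu_{\hat{\boldsymbol{\Delta}}}(M_I) < \beta$$

where $\hat{\boldsymbol{\Delta}} := \{\mathrm{diag}[\Delta, \Delta_2] : \Delta \in \mathbb{C}^{n \times n}, \Delta_2 \in \boldsymbol{\Delta}_2\}$. (If we had wanted to keep track of the smallest magnitude eigenvalue, as opposed to the smallest singular value, then the top block of $\hat{\boldsymbol{\Delta}}$ would instead be a repeated scalar block.) $\Box$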


8.4  Upper  bound  LFT  results 

Each of the Theorems 8.2 and 8.3 gives necessary and sufficient conditions for some performance/robustness characteristic in terms of a $\mu$ evaluation. Looking back at these theorems, we see that the $\mu$ test always looks like “Is $\mu(M) < \beta$?” (or $\le$). Hence, upper and lower bounds can be used in the following manner:

• an upper bound gives a sufficient condition for the robustness/performance characteristic of the theorem

• a lower bound gives a sufficient condition when the robustness/performance will not be met

Consequently, both are important. The upper bound will yield positive comments like “We are okay for perturbations up to this size, and maybe a lot better”, while the lower bound yields negative statements such as “There is a perturbation this size that will cause instability (or sufficient degradation in performance), and it might be worse”.

The above comments apply for any upper and lower bound. In this section, we will concentrate on the additional information that is obtained in using the $\bar\sigma(DMD^{-1})$ upper bound. In other words, because of its structure, $\bar\sigma(DMD^{-1}) < \beta$ in general implies a great deal more than $\mu(M) < \beta$. One word before proceeding: we drop the exponential notation for $D$'s, and revert back to (3.7) for the notation. Recall that the exponential parametrization was introduced in section 5 to allow simpler derivative formulas, which are implicit in the definition of $\nabla_M$.

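We begin with an obvious result: Trivial upper bound lemma: Let $\boldsymbol{\Delta}_1$ and $\boldsymbol{\Delta}_2$ be two given structures, and define a third, $\boldsymbol{\Delta} = \{\mathrm{diag}[\Delta_1, \Delta_2] : \Delta_i \in \boldsymbol{\Delta}_i\}$. Let $M$ be a given matrix such that $\mu_{1,2}(M)$ makes sense, and suppose there is a function $\mu_{ub}(\cdot)$ that is an upper bound for $\mu_{1,2}(\cdot)$. If $\mu_{ub}(M) < \beta$, then

$$\max_{\Delta_1 \in \frac{1}{\beta}\mathbf{B}\boldsymbol{\Delta}_1} \mu_2(F_u(M, \Delta_1)) < \beta$$

Proof: This follows directly from the constant matrix robust performance theorem, Theorem 8.2. $\Box$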

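The following theorem shows what additional information we get if the upper bound, $\mu_{ub}(\cdot)$, is in fact the $\bar\sigma(DMD^{-1})$ upper bound. As before, let $\boldsymbol{\Delta}_1$ and $\boldsymbol{\Delta}_2$ be two given structures, and let $\boldsymbol{\Delta} = \{\mathrm{diag}[\Delta_1, \Delta_2] : \Delta_i \in \boldsymbol{\Delta}_i\}$. Similarly, let $\mathcal{D}_1$ and $\mathcal{D}_2$ be the appropriate $D$ scaling sets for the two structures, and denote $\mathcal{D}$ as the obvious diagonal augmentation of these two sets.

Lemma 8.11 (Constant D lemma) Let $M$ be given as in the robust performance theorem, 8.2. Suppose there is a $D \in \mathcal{D}$ such that

$$\bar\sigma(DMD^{-1}) < \beta$$

Then there exists a $D_2 \in \mathcal{D}_2$ such that

$$\max_{\Delta_1 \in \frac{1}{\beta}\mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) < \beta$$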

Remarks: Initially, one might guess that if we replace $\mu$ by the $\bar\sigma(DMD^{-1})$ upper bound in the robust performance theorem hypothesis, the resulting claim would just have $\mu$ replaced by $\bar\sigma(DMD^{-1})$. This lemma shows that we get quite a bit more: if the $\bar\sigma(DMD^{-1})$ upper bound is less than $\beta$, this does not just imply that for all $\Delta_1 \in \boldsymbol{\Delta}_1$ with $\bar\sigma(\Delta_1) \le \frac{1}{\beta}$ the upper bound of $F_u(M, \Delta_1)$ is less than $\beta$. It implies, instead, that this is indeed so, but using only a single $D_2 \in \mathcal{D}_2$.


Proof: The easiest method of proof is just to track the norms of the various vectors in the loop equations for the LFT. Let $D_1$ and $D_2$ be the separate parts of the $D \in \mathcal{D}$ that achieves $\bar\sigma(DMD^{-1}) < \beta$. Obviously, $\mu_1(M_{11}) < \beta$, so for any $\Delta_1 \in \boldsymbol{\Delta}_1$ with $\bar\sigma(\Delta_1) \le \frac{1}{\beta}$ the two LFT's below are well posed, and from $d$ to $e$ are the same.

Figure 8.2 Diagram for Proof of Lemma 8.11 (panels: $LFT_1$ and $LFT_2$)

Let $d \ne 0$ be any given complex vector of appropriate dimension, and let $e$, $w$, and $z$ be the unique solutions to the loop equations for $LFT_2$. By hypothesis, we have

$$\|z\|^2 + \|e\|^2 < \beta^2\left(\|w\|^2 + \|d\|^2\right) \tag{8.19}$$

and since $\bar\sigma(\Delta_1) \le \frac{1}{\beta}$,

$$\|w\|^2 \le \frac{1}{\beta^2}\|z\|^2 \tag{8.20}$$

Combining these gives that

$$\|e\|^2 < \beta^2\|d\|^2. \tag{8.21}$$

Equation (8.21) also holds for $LFT_1$, since the map from $d$ to $e$ is the same for both LFTs. This implies that $\bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) < \beta$ as desired. $\Box$
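The conclusion of Lemma 8.11 can be observed on random samples. The sketch below makes several assumptions that are not in the text: both structures are taken to be diagonal (so the $D$ scalings are positive diagonal matrices), $\beta = 2$, and the perturbations are sampled at random rather than searched over, so the computed maximum is only a lower estimate of the true worst case.

```python
import numpy as np

def upper_lft(M, Delta1, n1):
    """F_u(M, Delta1) = M22 + M21 Delta1 (I - M11 Delta1)^{-1} M12."""
    M11, M12 = M[:n1, :n1], M[:n1, n1:]
    M21, M22 = M[n1:, :n1], M[n1:, n1:]
    return M22 + M21 @ Delta1 @ np.linalg.solve(np.eye(n1) - M11 @ Delta1, M12)

rng = np.random.default_rng(4)
n1, n2, beta = 3, 2, 2.0                      # illustrative sizes and bound

# Build M so that a known diagonal D gives sigma_max(D M D^{-1}) = 0.8 * beta < beta.
N = rng.standard_normal((n1 + n2, n1 + n2)) + 1j * rng.standard_normal((n1 + n2, n1 + n2))
N *= 0.8 * beta / np.linalg.norm(N, 2)
d = rng.uniform(0.5, 2.0, n1 + n2)
D, D_inv = np.diag(d), np.diag(1.0 / d)
M = D_inv @ N @ D                             # so that D M D^{-1} = N
D2, D2_inv = np.diag(d[n1:]), np.diag(1.0 / d[n1:])

# Sample diagonal Delta1 with sigma_max(Delta1) <= 1/beta and track the scaled norm.
worst_scaled_norm = 0.0
for _ in range(500):
    mags = rng.uniform(0.0, 1.0 / beta, n1)
    phases = rng.uniform(0.0, 2.0 * np.pi, n1)
    Delta1 = np.diag(mags * np.exp(1j * phases))
    F = upper_lft(M, Delta1, n1)
    worst_scaled_norm = max(worst_scaled_norm, np.linalg.norm(D2 @ F @ D2_inv, 2))
```

Every sample uses the same constant `D2`, which is the point of the lemma: one scaling works for the whole perturbation set.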


An interesting question is “what is the optimal constant scaling that one can apply?” In particular, suppose $\mu_1(M_{11}) < 1$. Therefore, for all $\Delta_1 \in \boldsymbol{\Delta}_1$ with $\bar\sigma(\Delta_1) \le 1$, the linear fractional transformation $F_u(M, \Delta_1)$ is defined. Can we compute the value of

$$\inf_{D_2 \in \mathcal{D}_2}\ \max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) \tag{8.22}$$

and also find a $D_2$ that achieves it? Towards answering this question, we have a simple lemma:
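
Lemma 8.12 Let $M$, $\boldsymbol{\Delta}_1$, $\boldsymbol{\Delta}_2$, $\mathcal{D}_1$, and $\mathcal{D}_2$ be given as usual. Suppose that the $\boldsymbol{\Delta}_2$ structure has dimension $n_2 \times n_2$. Define an augmented structure $\hat{\boldsymbol{\Delta}}$ as

$$\hat{\boldsymbol{\Delta}} := \left\{\mathrm{diag}[\Delta_1, \Delta] : \Delta_1 \in \boldsymbol{\Delta}_1,\ \Delta \in \mathbb{C}^{n_2 \times n_2}\right\} \tag{8.23}$$

Note that $\hat{\boldsymbol{\Delta}}$ is not $\boldsymbol{\Delta}_1$ augmented with $\boldsymbol{\Delta}_2$. It is $\boldsymbol{\Delta}_1$ augmented with an unstructured block the same size as $\boldsymbol{\Delta}_2$. Suppose that $\mu_1(M_{11}) < 1$. Then for $\alpha > 0$, and $D_2 \in \mathcal{D}_2$,

$$\mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} M_{11} & M_{12}D_2^{-1} \\ \alpha D_2 M_{21} & \alpha D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1 \tag{8.24}$$

if and only if

$$\max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) < \frac{1}{\alpha}. \tag{8.25}$$

Proof: Again, this follows directly from the definition of the structure $\hat{\boldsymbol{\Delta}}$, and the robust performance theorem, Theorem 8.2. $\Box$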

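This allows easy proof of the General optimal constant scaling theorem:

Theorem 8.13 Let $M$, $\boldsymbol{\Delta}_1$, $\boldsymbol{\Delta}_2$, $\mathcal{D}_1$, $\mathcal{D}_2$, and $\hat{\boldsymbol{\Delta}}$ be given as in Lemma 8.12. Suppose that $\mu_1(M_{11}) < 1$. Define $\gamma$ by

$$\gamma := \sup_{\alpha > 0} \left\{ \alpha : \inf_{D_2 \in \mathcal{D}_2} \mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} M_{11} & M_{12}D_2^{-1} \\ \alpha D_2 M_{21} & \alpha D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1 \right\} \tag{8.26}$$

Then

$$\inf_{D_2 \in \mathcal{D}_2}\ \max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) = \frac{1}{\gamma} \tag{8.27}$$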


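Proof: Note that since $\mu_1(M_{11}) < 1$, the value of $\gamma$ (in (8.26)) is positive. Next, let $r$ denote the infimum, that is

$$r := \inf_{D_2 \in \mathcal{D}_2}\ \max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) \tag{8.28}$$

We want to show that $r = \frac{1}{\gamma}$.

Let $\alpha < \gamma$. Then, from the definition of $\gamma$, there is a $D_2 \in \mathcal{D}_2$ such that

$$\mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} M_{11} & M_{12}D_2^{-1} \\ \alpha D_2 M_{21} & \alpha D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1 \tag{8.29}$$

Then Lemma 8.12 implies

$$\max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) < \frac{1}{\alpha} \tag{8.30}$$

so trivially $r < \frac{1}{\alpha}$. This holds for any $\alpha < \gamma$, in particular for small enough $\epsilon > 0$ and $\alpha := \gamma - \epsilon$. Therefore, for $\epsilon > 0$, $r < \frac{1}{\gamma - \epsilon}$, and taking limits gives $r \le \frac{1}{\gamma}$.

Suppose it is truly less, ie. $r < \frac{1}{\gamma}$. Then by the definition of $r$, (8.28), there is a $D_2 \in \mathcal{D}_2$ such that

$$\max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) < \frac{1}{\gamma}. \tag{8.31}$$

Then Lemma 8.12 and equation (8.31) imply that

$$\mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} M_{11} & M_{12}D_2^{-1} \\ \gamma D_2 M_{21} & \gamma D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1 \tag{8.32}$$

Using continuity, for small enough $\delta > 0$, we would then have

$$\mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} M_{11} & M_{12}D_2^{-1} \\ (\delta + \gamma) D_2 M_{21} & (\delta + \gamma) D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1 \tag{8.33}$$

which violates the definition of $\gamma$. Hence $r = \frac{1}{\gamma}$ as claimed. $\Box$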


This is an interesting result. Note that the structure which we need to compute $\mu$ with respect to does not depend on $\boldsymbol{\Delta}_2$. If $\mu_{\hat{\boldsymbol{\Delta}}}$ can be computed, then, modulo the necessary search over the $D_2$ and $\alpha$, this is a useful theorem. Later, in section 10, we will use the general optimal constant scaling theorem to optimally scale transfer functions using constant, block structured scalings. This, along with the small gain theorem, will provide a method of analyzing linear, time invariant, multivariable systems with structured, time varying and/or cone bounded nonlinear perturbations.

9 A Class of Uncertain Difference Equations


9.1  Robust  stability 

In this section, we present some robustness results for a class of uncertain systems. The presentation centers around discrete time systems, as the explanation seems simpler, though everything done here has a continuous time analog. The required transformation for continuous time systems is discussed in the appendix.

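Suppose $M \in \mathbb{C}^{(n+m) \times (n+m)}$ is given, partitioned as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \tag{9.1}$$

where $M_{11} \in \mathbb{C}^{n \times n}$, $M_{12} \in \mathbb{C}^{n \times m}$, $M_{21} \in \mathbb{C}^{m \times n}$, and $M_{22} \in \mathbb{C}^{m \times m}$. Let $\boldsymbol{\Delta}$ be an $m \times m$ block structure, with corresponding $D$ scaling set denoted by $\mathcal{D}$. Suppose $\mu_{\boldsymbol{\Delta}}(M_{22}) < 1$. Then for every $\Delta \in \mathbf{B}\boldsymbol{\Delta}$, the linear fractional transformation $F_l(M, \Delta)$ is a well defined element of $\mathbb{C}^{n \times n}$. Let $x_k \in \mathbb{C}^n$ evolve via the (possibly time varying) linear difference equation

$$x_{k+1} = F_l(M, \Delta_k)\, x_k \tag{9.2}$$

where for each time step $k$, $\Delta_k \in \mathbf{B}\boldsymbol{\Delta}$. Such a system would arise if a parametrically uncertain plant, as described in section 2, had a feedback controller that stabilized the nominal system, and we were interested in the stability of the closed loop for all the possible perturbed plants.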

Consider the following three assumptions on the uncertainty $\Delta_k$. For each $k$:

(a.1) $\Delta_k \in \boldsymbol{\Delta}$

(a.2) $\bar\sigma(\Delta_k) \le 1$

(a.3) $\Delta_k$ is fixed, ie. it does not vary with $k$

We  want  to  guarantee  the  stability  of  the  system  described  in  (9.2),  knowing  only  these 
three  assumptions. 

Since (a.3) implies that the system is time invariant, the stability of the uncertain system amounts to nothing more than checking the magnitude of the eigenvalues of $F_l(M, \Delta)$ for each $\Delta \in \mathbf{B}\boldsymbol{\Delta}$ and is equivalent to

$$\max_{\Delta \in \mathbf{B}\boldsymbol{\Delta}} \rho(F_l(M, \Delta)) < 1$$

Recall, from section 3.1, that $\rho(\cdot)$, the spectral radius, is a special case of $\mu$. Hence this question can be answered using Theorem 8.2, on $M$ with an augmented structure. The augmentation is straightforward. Define

$$\tilde{\boldsymbol{\Delta}} = \left\{\mathrm{diag}[\delta I_n, \Delta] : \delta \in \mathbb{C},\ \Delta \in \boldsymbol{\Delta}\right\} \tag{9.3}$$

For the upper bound, which we will use later, the corresponding $D$ scaling set will be denoted $\tilde{\mathcal{D}}$ and is of course

$$\tilde{\mathcal{D}} = \left\{\mathrm{diag}[D_1, D_2] : D_1 \in \mathbb{C}^{n \times n} \text{ is invertible},\ D_2 \in \mathcal{D}\right\} \tag{9.4}$$

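Theorem 9.1 The uncertain difference equation $x_{k+1} = F_l(M, \Delta)x_k$ is exponentially stable for each fixed $\Delta \in \mathbf{B}\boldsymbol{\Delta}$ if and only if

$$\mu_{\tilde{\boldsymbol{\Delta}}}(M) < 1,$$

where $\tilde{\boldsymbol{\Delta}}$ is defined in (9.3).

Proof: Follows by direct application of Theorem 8.2. $\Box$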

Remember, this is true for constant, but unknown $\Delta$. If assumption (a.3) above is discarded, then the system is time varying. At each step, the uncertain element may be different; we only know that at each step $k$, it lies in the norm bounded, structured set $\mathbf{B}\boldsymbol{\Delta}$. Obviously, simple spectral radius arguments do not apply. The next lemma gives a simple sufficient condition for stability.


Lemma 9.2 If

$$\max_{\Delta \in \mathbf{B}\boldsymbol{\Delta}} \bar\sigma(F_l(M, \Delta)) =: \beta < 1$$

then the uncertain, time varying difference equation (9.2) is exponentially stable, as long as $\Delta_k$ satisfies assumptions (a.1) and (a.2) for each time step $k$.

Proof: Regardless of the time variation of the perturbation, $\Delta_k$, we get that the norm of $x_k$ satisfies

$$\|x_k\| \le \beta^k \|x_0\| \tag{9.7}$$

which obviously decays to zero exponentially since $\beta < 1$ by assumption. $\Box$

As  stated,  Lemma  9.2  is  quite  conservative.  We  can  reduce  the  conservatism  by  allowing 
one  state  space  coordinate  change.  The  proof  is  simple,  and  is  omitted. 


Lemma 9.3 If there exists an invertible $T \in \mathbb{C}^{n \times n}$ such that

$$\max_{\Delta \in \mathbf{B}\boldsymbol{\Delta}} \bar\sigma\left(T F_l(M, \Delta) T^{-1}\right) = \beta < 1 \tag{9.8}$$

then for each $k$ the state $x_k$ of (9.2) is bounded by

$$\|x_k\| \le \kappa(T)\, \beta^k \|x_0\| \tag{9.9}$$

where $\kappa(T)$ denotes the condition number of $T$.
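For completeness, one way to obtain a bound of this form is sketched below. The scaled state $\tilde{x}_k := T x_k$ is notation introduced only for this sketch, and the argument assumes the hypothesis of Lemma 9.3 together with $\Delta_k \in \mathbf{B}\boldsymbol{\Delta}$ at every step.

$$\tilde{x}_{k+1} = T F_l(M, \Delta_k) T^{-1}\, \tilde{x}_k \quad\Longrightarrow\quad \|\tilde{x}_{k+1}\| \le \beta \|\tilde{x}_k\| \quad\Longrightarrow\quad \|\tilde{x}_k\| \le \beta^k \|\tilde{x}_0\|$$

$$\|x_k\| = \|T^{-1}\tilde{x}_k\| \le \bar\sigma(T^{-1})\, \beta^k\, \|T x_0\| \le \bar\sigma(T^{-1})\,\bar\sigma(T)\, \beta^k \|x_0\| = \kappa(T)\, \beta^k \|x_0\|$$

The condition number enters only through the change of coordinates at the first and last step, so it does not affect the decay rate $\beta$.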


Remark: The above reasoning is equivalent to finding a single, quadratic Lyapunov function for the entire set of “A” matrices

$$\{F_l(M, \Delta) : \Delta \in \mathbf{B}\boldsymbol{\Delta}\}.$$

This equivalence is evident via this lemma.

Lemma 9.4 Let $A \in \mathbb{C}^{n \times n}$ be given. There exists a Lyapunov matrix $P \in \mathbb{C}^{n \times n}$, $P = P^*$, $P > 0$ for $x_{k+1} = Ax_k$ if and only if there exists an invertible $T \in \mathbb{C}^{n \times n}$ such that $\bar\sigma(TAT^{-1}) < 1$.

Proof: $P$ is a Lyapunov matrix if and only if $A^*PA - P < 0$. This is equivalent to $P^{-\frac{1}{2}}A^*PAP^{-\frac{1}{2}} - I < 0$, which is the same as $\bar\sigma\left(P^{\frac{1}{2}}AP^{-\frac{1}{2}}\right) < 1$. $\Box$

Consequently, if $T$ is a coordinate transformation that solves (9.8), then $P := T^*T$ is a single Lyapunov matrix that works. Conversely, if $P$ is a correct Lyapunov matrix, then $P^{\frac{1}{2}}$ is a single coordinate transformation which solves (9.8).
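The correspondence between a Lyapunov matrix $P$ and a coordinate change $T = P^{\frac{1}{2}}$ can be illustrated for a single stable matrix. This sketch makes several assumptions: $A$ is real, the Lyapunov equation is solved with right-hand side $-I$ (any negative definite choice would do), and the Kronecker-product solve is used only because it needs nothing beyond NumPy.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4                                           # illustrative dimension

# A stable matrix: spectral radius 0.9, although sigma_max(A) may exceed 1.
A = rng.standard_normal((n, n))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))

# Solve A^T P A - P = -I. With row-major vectorization,
# vec(A^T P A) = kron(A^T, A^T) vec(P).
K = np.kron(A.T, A.T)
P = np.linalg.solve(np.eye(n * n) - K, np.eye(n).reshape(-1)).reshape(n, n)
P = 0.5 * (P + P.T)                             # remove rounding asymmetry

# T = P^{1/2} from the symmetric eigendecomposition of P.
eigvals_P, V = np.linalg.eigh(P)
T = V @ np.diag(np.sqrt(eigvals_P)) @ V.T

scaled_norm = np.linalg.norm(T @ A @ np.linalg.inv(T), 2)
lyapunov_residual = np.linalg.norm(A.T @ P @ A - P + np.eye(n))
```

Here `scaled_norm` is below one even when $\bar\sigma(A)$ is not, which is exactly what the lemma asserts for $T = P^{\frac{1}{2}}$.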

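Conceptually, the existence of a matrix $T$ satisfying condition (9.8) can be cast as a $\mu$ test. Again we must augment the $\boldsymbol{\Delta}$ perturbation structure, but this time with a full block, since we are checking $\bar\sigma(\cdot)$, and we must lug around the coordinate change $T$. The new structure $\hat{\boldsymbol{\Delta}}$ is

$$\hat{\boldsymbol{\Delta}} = \left\{\mathrm{diag}[\Delta_1, \Delta_2] : \Delta_1 \in \mathbb{C}^{n \times n},\ \Delta_2 \in \boldsymbol{\Delta}\right\}. \tag{9.10}$$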

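Now, using Theorem 8.2, we obviously have

Theorem 9.5 There exists an invertible $T \in \mathbb{C}^{n \times n}$ such that

$$\max_{\Delta \in \boldsymbol{\Delta},\ \bar\sigma(\Delta) \le 1} \bar\sigma\left(T F_l(M, \Delta) T^{-1}\right) < 1 \tag{9.11}$$

if and only if

$$\inf_{T \in \mathbb{C}^{n \times n}} \mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} T & 0 \\ 0 & I_m \end{bmatrix} M \begin{bmatrix} T^{-1} & 0 \\ 0 & I_m \end{bmatrix}\right) < 1 \tag{9.12}$$

where $\hat{\boldsymbol{\Delta}}$ is the full-block augmentation defined in (9.10).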

If we could calculate $\mu$ exactly, the condition (9.12) would in principle be something that could be checked, although it is unclear how the search over the $T$'s would be done. An interesting approach we will pursue here is to substitute the $\bar\sigma(DMD^{-1})$ upper bound for $\mu_{\hat{\boldsymbol{\Delta}}}(\cdot)$ and see what the resulting sufficient condition is.

First we need the correct $D$ scaling set for the full-block structure $\hat{\boldsymbol{\Delta}}$ of (9.10). If $\mathcal{D}$ is the appropriate set of $D$ scalings for $\boldsymbol{\Delta}$, then we define $\hat{\mathcal{D}} = \{\mathrm{diag}[d_1 I_n, D_2] : d_1 \ne 0,\ D_2 \in \mathcal{D}\}$. Substituting the upper bound in place of $\mu_{\hat{\boldsymbol{\Delta}}}$ gives that there is a transformation $T$ such that (9.11) is met if


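$$\inf_{T,\ \mathrm{diag}[d_1 I_n, D_2] \in \hat{\mathcal{D}}} \bar\sigma\left(\begin{bmatrix} d_1 I_n & 0 \\ 0 & D_2 \end{bmatrix} \begin{bmatrix} T & 0 \\ 0 & I_m \end{bmatrix} M \begin{bmatrix} T^{-1} & 0 \\ 0 & I_m \end{bmatrix} \begin{bmatrix} d_1^{-1} I_n & 0 \\ 0 & D_2^{-1} \end{bmatrix}\right) < 1. \tag{9.13}$$

The scalar $d_1$ is irrelevant, since it introduces no freedom that the coordinate change $T$ didn't already provide. Absorbing $d_1$ into $T$, we rewrite (9.13) as

$$\inf_{T,\ D_2 \in \mathcal{D}} \bar\sigma\left(\begin{bmatrix} T & 0 \\ 0 & D_2 \end{bmatrix} M \begin{bmatrix} T^{-1} & 0 \\ 0 & D_2^{-1} \end{bmatrix}\right) < 1. \tag{9.14}$$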


Note the effect the transformation $T$ has on the minimization in (9.14). Since $T$ is free to be any invertible matrix in $\mathbb{C}^{n \times n}$, the matrix $\mathrm{diag}[T, D_2]$ is some arbitrary element of the scaling set $\tilde{\mathcal{D}}$ of (9.4). Hence although (9.14) is condition (9.12) with $\mu$ replaced by its upper bound, the freedom in choosing the coordinate transformation “alters” the upper bound, so that the left hand side of (9.14) is just the $\bar\sigma(DMD^{-1})$ upper bound for the repeated scalar structure (9.3) (not the full-block structure (9.10) that was originally there in (9.12)). In other words, (9.14) is just

$$\inf_{D \in \tilde{\mathcal{D}}} \bar\sigma\left(DMD^{-1}\right) < 1, \tag{9.15}$$

and this is a sufficient condition for Theorem 9.5 to hold. We write this as a theorem.

Theorem 9.6 If there exists a $D \in \tilde{\mathcal{D}}$, the scaling set of (9.4), such that $\bar\sigma(DMD^{-1}) = \beta < 1$, then the uncertain, time varying, linear system

$$x_{k+1} = F_l(M, \Delta_k)\, x_k, \quad \Delta_k \in \mathbf{B}\boldsymbol{\Delta} \tag{9.16}$$

is exponentially stable.
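A small simulation illustrates Theorem 9.6. Everything specific here is an assumption made for the sketch: the structure $\boldsymbol{\Delta}$ is taken to be diagonal with real entries, $M$ is constructed backwards from a known scaling $D = \mathrm{diag}[D_1, D_2]$ so that $\bar\sigma(DMD^{-1}) = \beta = 0.95$, and the perturbation sequence is drawn at random. The quantity tracked is $\|D_1 x_k\|$, which should contract by at least the factor $\beta$ at every step.

```python
import numpy as np

def lower_lft(M, Delta, n):
    """F_l(M, Delta) = M11 + M12 Delta (I - M22 Delta)^{-1} M21."""
    M11, M12 = M[:n, :n], M[:n, n:]
    M21, M22 = M[n:, :n], M[n:, n:]
    m = M22.shape[0]
    return M11 + M12 @ Delta @ np.linalg.solve(np.eye(m) - M22 @ Delta, M21)

rng = np.random.default_rng(6)
n, m, beta = 3, 2, 0.95                        # illustrative sizes and bound

# N has sigma_max(N) = beta; M = D^{-1} N D so that D M D^{-1} = N.
N = rng.standard_normal((n + m, n + m))
N *= beta / np.linalg.norm(N, 2)
G = rng.standard_normal((n, n))
D1 = 3.0 * np.eye(n) + G / np.linalg.norm(G, 2)   # full, invertible state scaling
D2 = np.diag(rng.uniform(0.5, 2.0, m))            # commutes with diagonal Delta_k
D = np.block([[D1, np.zeros((n, m))], [np.zeros((m, n)), D2]])
M = np.linalg.solve(D, N @ D)

steps = 60
x = rng.standard_normal(n)
scaled_norms = [np.linalg.norm(D1 @ x)]
for k in range(steps):
    Delta_k = np.diag(rng.uniform(-1.0, 1.0, m))  # new perturbation at every step
    x = lower_lft(M, Delta_k, n) @ x
    scaled_norms.append(np.linalg.norm(D1 @ x))

contraction_holds = all(b <= beta * a * (1.0 + 1e-9)
                        for a, b in zip(scaled_norms, scaled_norms[1:]))
```

In the unscaled coordinates this gives $\|x_k\| \le \kappa(D_1)\beta^k\|x_0\|$, so $D_1$ plays the role of the single coordinate change $T$.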

How  do  all  these  different  conditions  fit  together? 


9.a Theorem 9.1 showed that $\mu_{\tilde{\boldsymbol{\Delta}}}(M) < 1$ is both necessary and sufficient for robust stability of (9.2) with constant, but unknown structured perturbations.

9.b Next, Theorem 9.5 gave a necessary and sufficient condition for the existence of a single, quadratic Lyapunov function for the entire set of systems.

9.c Unfortunately, the condition in Theorem 9.5 is not really a verifiable condition, so we substituted a $\mu$ test with a $\bar\sigma(DMD^{-1})$ upper bound test. This gives that $\inf_{D \in \tilde{\mathcal{D}}} \bar\sigma(DMD^{-1}) < 1$ is a sufficient condition for robust stability with unknown, time varying, structured perturbations.

Note the similarity between the test in Theorem 9.1, and the test in Theorem 9.6. Both are associated with the repeated scalar structure (9.3): one involves $\mu$ and one involves the $\bar\sigma(DMD^{-1})$ upper bound. Yet the conclusions each give are quite different. This sheds a little light on how fundamentally different the upper bound and $\mu$ are.

This  final  result  described  in  Theorem  9.6  can  also  be  derived  from  a  different  point  of 
view,  utilizing  Lemma  8.11  from  section  8.4  along  with  the  small  gain  theorem. 


Note  that  the  perturbed  system,  (9.2),  is  just  the  loop  shown  below. 


Figure  9.1  Perturbed  System,  Equation  (9.2) 


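Define the transfer function $G(z) = M_{22} + M_{21}(zI - M_{11})^{-1}M_{12}$. If we can find a $D_1$ and $D_2$ (in the appropriate scaling sets, $\mathrm{diag}[D_1, D_2] \in \tilde{\mathcal{D}}$) such that

$$\bar\sigma(DMD^{-1}) = \bar\sigma\left(\begin{bmatrix} D_1 M_{11} D_1^{-1} & D_1 M_{12} D_2^{-1} \\ D_2 M_{21} D_1^{-1} & D_2 M_{22} D_2^{-1} \end{bmatrix}\right) < 1$$

then using the Constant D lemma, 8.11, we get

$$\left\|D_2 G(z) D_2^{-1}\right\|_\infty < 1. \tag{9.17}$$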


Now for $\Delta_k \in \boldsymbol{\Delta}$, the two loops below are equivalent, even if $\Delta_k$ varies with $k$, because $D_2$ is constant, and hence it commutes with linear time varying operators as well.

Figure 9.2 Equivalent Loops

Therefore, a trivial application of the small gain theorem along with equation (9.17) gives that the perturbed loop is stable for all time varying $\Delta_k$ with $\bar\sigma(\Delta_k) \le 1$, as expected, and in agreement with Theorem 9.6 and 9.c above.


9.2  Robust  performance 

We have seen how the upper bound plays a role in determining some robustness properties of a class of uncertain difference equations when the perturbations are structured, and time varying. In this section, we continue exploring the difference between $\mu$ and the upper bound with the added objective of performance. Performance will be characterized in terms of the zero initial state $l_2$ gain from disturbance to error. Recall that for time-invariant systems, this is the same as the $\|\cdot\|_\infty$ norm of the transfer function from disturbance to error.

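We begin with a matrix $M \in \mathbb{C}^{(n+n_e+m) \times (n+n_d+m)}$, partitioned obviously, and relating the variables via

$$\begin{bmatrix} x_{k+1} \\ e_k \\ z_k \end{bmatrix} = \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{bmatrix} \begin{bmatrix} x_k \\ d_k \\ w_k \end{bmatrix} \tag{9.18}$$

The uncertainty is “feedback” from $z$ to $w$ through a structured $\Delta \in \boldsymbol{\Delta}$, where $\boldsymbol{\Delta}$ is a prescribed $m \times m$ block structure. Consider an uncertain linear system (possibly time varying) driven by a disturbance input $d_k$, with output error $e_k$.

$$\begin{bmatrix} x_{k+1} \\ e_k \end{bmatrix} = F_l(M, \Delta_k) \begin{bmatrix} x_k \\ d_k \end{bmatrix} \tag{9.19}$$

With respect to this partition, $F_l(M, \Delta_k)$ is

$$\begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} + \begin{bmatrix} M_{13} \\ M_{23} \end{bmatrix} \Delta_k \left(I - M_{33}\Delta_k\right)^{-1} \begin{bmatrix} M_{31} & M_{32} \end{bmatrix}$$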

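We need two augmented block structures. Define $\boldsymbol{\Delta}_P$ and $\boldsymbol{\Delta}_{RP}$ as

$$\boldsymbol{\Delta}_P := \left\{\mathrm{diag}[\delta_1 I_n, \Delta_2] : \delta_1 \in \mathbb{C},\ \Delta_2 \in \mathbb{C}^{n_d \times n_e}\right\} \tag{9.20}$$

$$\boldsymbol{\Delta}_{RP} := \left\{\mathrm{diag}[\Delta_P, \Delta] : \Delta_P \in \boldsymbol{\Delta}_P,\ \Delta \in \boldsymbol{\Delta}\right\} \tag{9.21}$$

Suppose that $\mu_{\boldsymbol{\Delta}}(M_{33}) < 1$.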

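Theorem 9.7 (Robust Performance) For all $\Delta \in \mathbf{B}\boldsymbol{\Delta}$, the uncertain system (9.19) is stable and for zero initial state response, the error $e$ satisfies $\|e\|_2 < \|d\|_2$ if and only if

$$\mu_{\boldsymbol{\Delta}_{RP}}(M) < 1,$$

where $\boldsymbol{\Delta}_{RP}$ is the augmented structure of (9.21).

Proof:

($\Leftarrow$) Let $\Delta \in \mathbf{B}\boldsymbol{\Delta}$ be given. Since $\mu_{\boldsymbol{\Delta}_{RP}}(M) < 1$, Theorem 8.2 gives

$$\mu_{\boldsymbol{\Delta}_P}(F_l(M, \Delta)) < 1, \tag{9.22}$$

with $\boldsymbol{\Delta}_P$ the structure of (9.20). Stability is apparent, and the $l_2$ performance follows from the example in section 8.3.1.

($\Rightarrow$) Essentially, the steps are reversed. $\Box$

What can be concluded if the upper bound of $\mu_{\boldsymbol{\Delta}_{RP}}(M)$, for the structure of (9.21), is less than 1?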


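Theorem 9.8 Let $M$ be given as in (9.18), along with a block structure $\boldsymbol{\Delta}$. If there is a $D$ in the $D$ scaling set of the augmented structure (9.21) such that

$$\bar\sigma\left(DMD^{-1}\right) = \beta < 1$$

then for all sequences $\{\Delta_k\}_{k=0}^{\infty}$ with $\Delta_k \in \mathbf{B}\boldsymbol{\Delta}$, the time varying, uncertain system

$$\begin{bmatrix} x_{k+1} \\ e_k \end{bmatrix} = F_l(M, \Delta_k) \begin{bmatrix} x_k \\ d_k \end{bmatrix} \tag{9.23}$$

is zero-input, exponentially stable, and if $x_0 = 0$, and $\{d_k\} \in l_2$, then $\|e\|_2 \le \beta\|d\|_2$.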
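The $l_2$ gain bound of Theorem 9.8 can be illustrated by simulation. The sketch below rests on assumptions not stated in the text: $n_e = n_d$, a diagonal real structure $\boldsymbol{\Delta}$, a scaling of the form $D = \mathrm{diag}[D_1, I, D_3]$ (the scalar on the performance block normalized to one), a matrix $M$ built backwards from that $D$ with $\bar\sigma(DMD^{-1}) = \beta = 0.9$, and a randomly generated disturbance of finite duration.

```python
import numpy as np

rng = np.random.default_rng(7)
n, nd, m, beta = 3, 2, 2, 0.9                   # n_e = n_d = nd in this sketch

# N has sigma_max(N) = beta; M = D^{-1} N D so that D M D^{-1} = N.
size = n + nd + m
N = rng.standard_normal((size, size))
N *= beta / np.linalg.norm(N, 2)
G = rng.standard_normal((n, n))
D = np.zeros((size, size))
D[:n, :n] = 3.0 * np.eye(n) + G / np.linalg.norm(G, 2)   # full block for the state
D[n:n + nd, n:n + nd] = np.eye(nd)                       # performance block scaling
D[n + nd:, n + nd:] = np.diag(rng.uniform(0.5, 2.0, m))  # commutes with diagonal Delta_k
M = np.linalg.solve(D, N @ D)

r2 = n + nd                                     # first row/column of the z / w partition
M31, M32, M33 = M[r2:, :n], M[r2:, n:r2], M[r2:, r2:]

x = np.zeros(n)                                 # zero initial state
energy_d = 0.0
energy_e = 0.0
for k in range(200):
    d = rng.standard_normal(nd) if k < 50 else np.zeros(nd)
    Delta_k = np.diag(rng.uniform(-1.0, 1.0, m))
    # Close the uncertainty loop w = Delta_k z, where z = M31 x + M32 d + M33 w.
    z = np.linalg.solve(np.eye(m) - M33 @ Delta_k, M31 @ x + M32 @ d)
    w = Delta_k @ z
    out = M[:r2, :] @ np.concatenate([x, d, w])
    x, e = out[:n], out[n:]
    energy_d += d @ d
    energy_e += e @ e

l2_gain_estimate = np.sqrt(energy_e / energy_d)
```

The ratio `l2_gain_estimate` is for one disturbance and one perturbation sequence only; the theorem bounds it by $\beta$ for every such pair.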

The  results  we  have  obtained  for  time  varying  perturbations  extend  to  a  special  class  of 
nonlinear  perturbations.  The  appropriate  definitions  and  assumptions  are  the  subject  of 
the  next  section. 

9.3  Cone  bounded  nonlinearities 

Let  N  be  the  set  of  nonnegative  integers,  and  let  O  be  any  set. 


k- 


Definition  9.9  A  unstructured,  memoryless,  nonlinear  operator,  S:  Nx  O  x  Cnd— >Cn% 
is  cone  bounded  (of  size  a)  if  there  exists  a  a  >  0  such  that  for  all  d  6  Cnd,  o  £  O,  and 
all  keN 

II S(k,o,d)  ||  <  a||# 

In  the  definition,  the  set  O  can  represent  dependencies  of  nonlinearity  S  on  other  param- 


Unfortunately, the notion of an $n \times n$ repeated scalar, cone bounded operator is trickier. The natural definition would involve a single scalar cone bounded nonlinearity, which would then be applied separately to each of the $n$ components of the input vector. Our framework cannot directly handle this, and we must treat this type of uncertainty as $n$ independent, cone bounded scalar nonlinearities. So, when we refer to a cone bounded, repeated scalar block, we in fact mean a block of the form $\gamma(k,o) I_n$. Note that $\gamma$ can be time varying, and depend on the other parameters which the set $O$ represents. The key is that all $n$ signals into this block get multiplied by the same scalar parameter, namely $\gamma$.

Finally, a block structured, cone bounded nonlinearity is the obvious block diagonal collection of several of these blocks. With this definition, results similar to the time varying (but linear) results are possible.
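
As a small illustration of Definition 9.9, the sketch below checks the cone bound numerically for two made-up operators: a radial saturation and a repeated scalar block $\gamma(k,o) I_n$. Both operators, the sampling ranges and the helper names are assumptions chosen for illustration; they do not come from the report. Random sampling gives evidence only, not a proof of the bound.

```python
import math
import random

def norm(v):
    """Euclidean norm of a complex vector stored as a list."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def saturation(k, o, d):
    """Hypothetical memoryless nonlinearity: radial saturation at level o > 0.

    Vectors with norm <= o pass unchanged, longer ones are scaled back to norm o.
    The time index k is unused here, although the definition would allow it.
    """
    n = norm(d)
    return list(d) if n <= o else [c * (o / n) for c in d]

def repeated_scalar(k, o, d):
    """Hypothetical repeated scalar block gamma(k, o) * I_n with |gamma| <= 1."""
    gamma = math.cos(k) * min(1.0, abs(o))
    return [gamma * c for c in d]

def is_cone_bounded(op, alpha, trials=1000, n=3, seed=0):
    """Sample-based check of ||op(k, o, d)|| <= alpha * ||d||."""
    rng = random.Random(seed)
    for _ in range(trials):
        k = rng.randrange(100)
        o = rng.uniform(0.1, 2.0)
        d = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        if norm(op(k, o, d)) > alpha * norm(d) + 1e-12:
            return False
    return True

def doubling(k, o, d):
    """An operator with gain 2, which is not cone bounded with size 1."""
    return [2.0 * c for c in d]
```

Both `saturation` and `repeated_scalar` pass the size-1 test on these samples, while `doubling` fails it.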

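Theorem 9.10 Let $M$ be given as in (9.18), along with a block structure $\boldsymbol{\Delta}$. Suppose $\Delta: \mathbf{N} \times O \times \mathbf{C}^{n_z} \to \mathbf{C}^{n_w}$ is a block structured, cone bounded nonlinearity, with cone of size 1. If there is a $D \in \mathcal{D}$ such that $\bar\sigma\left(D^{1/2} M D^{-1/2}\right) = \beta < 1$, then the uncertain system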

\[
\begin{bmatrix} x_{k+1} \\ e_k \\ z_k \end{bmatrix} =
\begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{bmatrix}
\begin{bmatrix} x_k \\ d_k \\ w_k \end{bmatrix},
\qquad w_k = \Delta(k, o_k, z_k)
\]

is zero-input, exponentially stable, and if $x_0 = 0$ and $\{d_k\}_{k=0}^{\infty} \in l_2$, then $\|e\|_2 \le \beta \|d\|_2$.

10 Optimal Constant D scalings for Multivariable Systems

This  section  combines  two  results  from  previous  sections,  to  yield  a  method  for  sub-optimal 
and  optimal  scaling  of  multivariable  transfer  functions  using  constant,  diagonal  D  matrices. 

Let $G(z)$ be a given, stable, transfer function, with $m$ inputs and $m$ outputs, and state space realization

\[
G(z) = D + C (zI - A)^{-1} B \tag{10.1}
\]

Suppose a perturbation structure $\boldsymbol{\Delta}_2$ is given, and is compatible with $G(z)$. That is, $\boldsymbol{\Delta}_2 \subset \mathbf{C}^{m \times m}$. As usual, let $\mathcal{D}_2$ denote the set of diagonal scalings (here, for simplicity, we revert back to the nonexponential notation for the $D$'s, i.e., the set $\mathcal{D}$ from section 5.1) that commute with all elements of $\boldsymbol{\Delta}_2$.

Optimal constant scaling is the constant $D$ scale that achieves the following infimum (if it exists, otherwise, a scaling that gets arbitrarily close)

\[
\inf_{D_2 \in \mathcal{D}_2} \; \sup_{z \in \mathbf{C},\ |z| \ge 1} \bar\sigma\left(D_2 G(z) D_2^{-1}\right) \tag{10.2}
\]

Remark: This is useful because any linear perturbation, even a time varying perturbation, with the appropriate block diagonal structure as defined by $\boldsymbol{\Delta}_2$, commutes with these constant $D$ scales. Therefore, for every constant $D \in \mathcal{D}_2$ and every operator $\Delta_2$ with the correct block diagonal structure, the following operators are the same

\[
D \Delta_2 D^{-1} = \Delta_2
\]

Therefore, for any operator $G$, the following systems are equivalent (any solution to the loop equations of one system is also a solution to the loop equations of the other).

Figure 10.1 Equivalent Loops

Simple application of the small gain theorem ([Zam] and [DesV]) to the right figure gives that if $\Delta_2$ is a stable operator mapping $l_2 \to l_2$, and the induced norm of $\Delta_2$, $\|\Delta_2\|$, satisfies

\[
\|\Delta_2\| < \frac{1}{\|D G D^{-1}\|_\infty}
\]

then the loop is stable. Hence, if we can maximize the right hand side, this will eliminate some of the conservatism in the small gain theorem due to the structure of the perturbation. This calls for a minimization of the form

\[
\inf_{D \in \mathcal{D}_2} \|D G D^{-1}\|_\infty
\]

An important point to reiterate is that the $D$'s are constant. If they were frequency varying, then in general they would not commute with time varying $\Delta$'s, and hence the equivalence of the two figures would be invalid.


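Referring back to section 8.4 we see that, at least conceptually, Theorem 8.13 gives the value of the infimum. Here we will capitalize on the additional structure that is present in this specific problem, and use the result for block structures with $f = s = 1$ which we obtained in section 7.2 to give a computationally tractable approach.

First, let $A \in \mathbf{C}^{n \times n}$, $B \in \mathbf{C}^{n \times m}$, $C \in \mathbf{C}^{m \times n}$, and $D \in \mathbf{C}^{m \times m}$ be a realization of $G(z)$. We assume $G$ is stable, so $\rho(A) < 1$. Recall that by inverting the Z transform variable, we can rewrite (10.2) as

\[
\inf_{D_2 \in \mathcal{D}_2} \; \max_{\Delta_1 \in \mathbf{B}\boldsymbol{\Delta}_1} \bar\sigma\left(D_2 F_u(M, \Delta_1) D_2^{-1}\right) \tag{10.3}
\]

where

\[
M := \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{10.4}
\]

and $\boldsymbol{\Delta}_1 = \{\delta I_n : \delta \in \mathbf{C}\}$.

Direct application of Theorem 8.13 from 8.4 gives,

\[
\inf_{D_2 \in \mathcal{D}_2} \; \sup_{z \in \mathbf{C},\ |z| \ge 1} \bar\sigma\left(D_2 G(z) D_2^{-1}\right) = \frac{1}{\gamma} \tag{10.5}
\]

where $\gamma$ is defined by

\[
\gamma := \sup_{\alpha > 0} \left\{ \alpha : \inf_{D_2 \in \mathcal{D}_2} \mu_{\boldsymbol{\Delta}} \begin{bmatrix} A & B D_2^{-1} \\ \alpha D_2 C & \alpha D_2 D D_2^{-1} \end{bmatrix} < 1 \right\} \tag{10.6}
\]

using the block structure

\[
\boldsymbol{\Delta} := \left\{ \mathrm{diag}\left[\delta_1 I_n, \Delta\right] : \delta_1 \in \mathbf{C},\ \Delta \in \mathbf{C}^{m \times m} \right\}. \tag{10.7}
\]

This, $\boldsymbol{\Delta}$ in (10.7), is precisely the structure we considered in section 7.2, and with respect to this structure, $\mu(M) = \inf_{D \in \mathcal{D}} \bar\sigma\left(D M D^{-1}\right)$. Hence the quantity $\gamma$ in (10.6) can be defined in terms of the upper bound, instead of $\mu$. The expression for $\gamma$ below follows immediately by substituting the infimum for $\mu$ into (10.6).

\[
\gamma = \sup_{\alpha > 0} \left\{ \alpha : \inf_{D_2 \in \mathcal{D}_2} \; \inf_{D_1 \text{ invertible}} \bar\sigma \begin{bmatrix} D_1 A D_1^{-1} & D_1 B D_2^{-1} \\ \alpha D_2 C D_1^{-1} & \alpha D_2 D D_2^{-1} \end{bmatrix} < 1 \right\} \tag{10.8}
\]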


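We state this as a theorem.

Theorem 10.1 Let $G(z)$ and $\boldsymbol{\Delta}_2$ be given as in the beginning of this section. Define $\gamma \in \mathbf{R}$ by

\[
\gamma := \sup_{\alpha > 0} \left\{ \alpha : \inf_{\substack{D_1 \text{ invertible} \\ D_2 \in \mathcal{D}_2}} \bar\sigma \begin{bmatrix} D_1 A D_1^{-1} & D_1 B D_2^{-1} \\ \alpha D_2 C D_1^{-1} & \alpha D_2 D D_2^{-1} \end{bmatrix} < 1 \right\} \tag{10.9}
\]

Then

\[
\inf_{D_2 \in \mathcal{D}_2} \; \sup_{z \in \mathbf{C},\ |z| \ge 1} \bar\sigma\left(D_2 G(z) D_2^{-1}\right) = \frac{1}{\gamma} \tag{10.10}
\]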


How is this computed? For a given $\alpha > 0$, we can find the infimum using the descent directions for $\bar\sigma$ that were presented in section 5.3. Carrying out a one dimensional search to find the correct value of $\gamma$ completes the calculation.
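
The one dimensional search can be sketched for the simplest case $n = m = 1$, where $A, B, C, D$ are real scalars and the scalings reduce to a single positive ratio $k = d_1/d_2$. The sketch below is an illustration under those assumptions only: it replaces the descent directions of section 5.3 with a golden-section search over $\log k$ (which relies on the scaled norm being unimodal in $\log k$), and all function names are made up for the example.

```python
import math

def sigma_max_2x2(p, q, r, s):
    """Largest singular value of the real matrix [[p, q], [r, s]]."""
    t = p * p + q * q + r * r + s * s          # trace of M^T M
    det = p * s - q * r
    disc = max(t * t - 4.0 * det * det, 0.0)
    return math.sqrt(0.5 * (t + math.sqrt(disc)))

def scaled_sigma(a, b, c, d, alpha, log_k):
    """sigma_max of [[a, b*k], [alpha*c/k, alpha*d]] with k = d1/d2 = exp(log_k)."""
    k = math.exp(log_k)
    return sigma_max_2x2(a, b * k, alpha * c / k, alpha * d)

def inf_over_scalings(a, b, c, d, alpha, lo=-20.0, hi=20.0, iters=200):
    """Golden-section search over log(d1/d2) for the smallest scaled norm."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if scaled_sigma(a, b, c, d, alpha, x1) <= scaled_sigma(a, b, c, d, alpha, x2):
            hi = x2
        else:
            lo = x1
    return scaled_sigma(a, b, c, d, alpha, 0.5 * (lo + hi))

def gamma_by_bisection(a, b, c, d, tol=1e-8, max_alpha=1e6):
    """Largest alpha (up to tol) for which the scaled norm can be made < 1."""
    lo, hi = 0.0, 1.0
    # Grow the bracket until the test fails at hi (capped to avoid an endless loop).
    while inf_over_scalings(a, b, c, d, hi) < 1.0 and hi < max_alpha:
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inf_over_scalings(a, b, c, d, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return lo

# Example: G(z) = 1 / (z - 0.5), i.e. A = 0.5, B = C = 1, D = 0.
gamma = gamma_by_bisection(0.5, 1.0, 1.0, 0.0)
```

For this example the largest gain of $G$ on $|z| \ge 1$ is $|G(1)| = 2$, so the search should return $\gamma$ close to $0.5$, in agreement with (10.10).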

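The sufficient condition is easy, and follows directly from Lemma 8.11.

Lemma 10.2 If there is a $\mathrm{diag}[D_1, D_2] \in \mathcal{D}$, and an $\alpha > 0$ such that

\[
\bar\sigma \begin{bmatrix} D_1 A D_1^{-1} & D_1 B D_2^{-1} \\ \alpha D_2 C D_1^{-1} & \alpha D_2 D D_2^{-1} \end{bmatrix} < 1
\]

then

\[
\|D_2 G D_2^{-1}\|_\infty < \frac{1}{\alpha}.
\]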


11  Frequency  domain  techniques 


The most well known use of $\mu$ is as a frequency domain tool, specifically, as a generalization of the singular value tools developed in the late 70's, [DoyS]. Singular values are useful for one full block of uncertainty, but are generally conservative when the uncertainty has structure (recall, for one full block, $\mu = \bar\sigma$, but for other structures the gap between $\mu$ and $\bar\sigma$ may be arbitrarily large). Hence singular value-like frequency plots, using $\mu$ instead of $\bar\sigma$, can handle structured unmodeled dynamics, [DoyWS].

This section will present a simple set of modeling assumptions, along with the robustness theorems that subsequently arise. The modeling approach we adopt here is quite unsophisticated. This will help us avoid more complicated topological issues of modeling uncertainty, which would take us too far from the spirit of the research. A natural way to view uncertainty in an individual component is as follows: the only knowledge about the actual component is that it lies in some prescribed set of possible components (the use of $\mu$ almost requires that the prescribed set be defined in terms of a linear fractional transformation). Work by Vidyasagar and [FooP] has shown that the set representing the actual component should be path connected in the graph topology. The graph topology is a topology on the space of proper, rational transfer matrices. It was introduced in [Vid], and is best characterized in terms of coprime factorizations. We would like to bypass this issue, since it is not central to the ideas here. Moreover, obtaining necessary conditions for robust stability is much less understood in this framework. Consequently, we will be content with the simplified uncertainty modeling presented here. Fortunately, in either approach, the robustness test (using a $\mu$ framework) will still involve calculating $\mu$ on a specific nominal, closed loop transfer function.

Apart from the differences in time domains (continuous versus discrete), the results of this section are entirely equivalent to those from section 9. In effect, we replace the single $\mu$ test of Theorems 9.1 and 9.7 with a frequency varying $\mu$ test on a smaller matrix and correspondingly smaller block structure. This is possible via the maximum modulus result, Theorem 8.6. In spite of this mathematical equivalence, the results in this section are derived using a Nyquist-based argument, which is consistent with the historical development of these robustness methods.

We begin with some well known results on the stability of feedback loops.

11.1 Stability of Feedback Loops

Consider two finite dimensional, linear, time invariant systems described by the equations

\[
\dot{x}_i = A_i x_i + B_i u_i, \qquad y_i = C_i x_i + D_i u_i
\]

Assume that the number of outputs in system 1 equals the number of inputs to system 2, and vice versa. Hence $D_1 D_2$ is a square matrix, and we assume that $I + D_1 D_2$ is invertible. Let $M_i(s)$ denote the transfer function of each system.

Suppose that for each $i = 1,2$, the pair $(A_i, B_i)$ is stabilizable, and the pair $(A_i, C_i)$ is detectable. Consider the interconnection $u_1 = y_2 + v_1$; $u_2 = v_2 - y_1$ shown below.

Figure 11.1 Feedback Interconnection of Two Systems

Then the internal dynamics of the interconnection, which are governed by the matrix

\[
\begin{bmatrix}
A_1 - B_1 (I + D_2 D_1)^{-1} D_2 C_1 & B_1 (I + D_2 D_1)^{-1} C_2 \\
-B_2 (I + D_1 D_2)^{-1} C_1 & A_2 - B_2 (I + D_1 D_2)^{-1} D_1 C_2
\end{bmatrix}
\]

are stable if and only if the transfer function from $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ to $\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ has all of its poles in the open left half plane (proper, rational, transfer functions with all poles in the open left half plane will be referred to as stable). This is easy to verify by showing that the internal dynamics are stabilizable from $v$, and detectable from $y$.

Theorem 11.1 If both $M_1$ and $M_2$ are stable, then the interconnection is stable if and only if $(I + M_1(s) M_2(s))^{-1}$ is stable.

Proof: All four of the transfer functions are linear combinations of $I$, $M_1$, $M_2$, and $(I + M_1 M_2)^{-1}$, hence, if these are separately stable, all 4 of the transfer functions are. Conversely, $(I + M_1(s) M_2(s))^{-1}$ is equal to $I - H_{y_1 v_2}$, where $H_{y_1 v_2}$ is the transfer function from $v_2$ to $y_1$. Hence $(I + M_1 M_2)^{-1}$ necessarily is stable if the interconnection is. $\Box$

Alternatively, we have the multivariable Nyquist test, which in the case that both systems are stable, has a particularly simple form.

Theorem 11.2 Suppose both $M_1$ and $M_2$ are stable. The interconnection is stable if and only if the Nyquist plot of $\det\left(I + M_1(j\omega) M_2(j\omega)\right)$ does not pass through or encircle the origin, as $\omega$ varies from $-\infty \to \infty$.
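
The Nyquist test of Theorem 11.2 can be tried numerically in the scalar case, where the determinant is just $1 + M_1(j\omega) M_2(j\omega)$. The sketch below uses a made-up stable loop gain $M_1 M_2 = k/(s+1)^3$, for which the closed loop is stable exactly when $k < 8$; the loop gain, the sampling rule $\omega = \tan\theta$ and the function names are assumptions for illustration.

```python
import cmath
import math

def loop_gain(k, w):
    """Hypothetical stable loop gain M1(jw) * M2(jw) = k / (jw + 1)^3."""
    return k / (1j * w + 1.0) ** 3

def encirclements(k, n=20000):
    """Net counterclockwise encirclements of the origin by 1 + M1(jw) M2(jw)."""
    # Sample w = tan(theta) so that the whole axis (-inf, inf) is covered.
    pts = []
    for i in range(n):
        theta = -math.pi / 2.0 + math.pi * (i + 0.5) / n
        pts.append(1.0 + loop_gain(k, math.tan(theta)))
    pts.append(pts[0])  # close the curve; both ends lie next to the point 1
    # Sum the small phase increments along the sampled curve.
    total = sum(cmath.phase(pts[i + 1] / pts[i]) for i in range(n))
    return round(total / (2.0 * math.pi))

stable_case = encirclements(4.0)     # k < 8
unstable_case = encirclements(16.0)  # k > 8
```

With $k = 4$ the plot does not encircle the origin. With $k = 16$ it winds around the origin twice in the clockwise sense, matching the two closed loop poles in the right half plane.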

11.2  Representing  unmodeled  dynamics 

In this section we describe a simple set of assumptions for modeling components with unmodeled dynamics. As mentioned earlier, similar, but more sophisticated assumptions exist, [FooP].

Consider a "two" input, "two" output system $G$ described by the following state space equations

\[
\begin{bmatrix} \dot{x} \\ y_1 \\ y_2 \end{bmatrix} =
\begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix}
\begin{bmatrix} x \\ u_1 \\ u_2 \end{bmatrix} \tag{11.1}
\]

where $A \in \mathbf{R}^{n \times n}$, $B_i \in \mathbf{R}^{n \times n_{u_i}}$, $C_i \in \mathbf{R}^{n_{y_i} \times n}$, $D_{ij} \in \mathbf{R}^{n_{y_i} \times n_{u_j}}$. We will use this state space description to represent an uncertain component. We begin with the following assumptions:

• The nominal model for this component is given by the quadruple $(A, B_2, C_2, D_{22})$. The pair $(A, B_2)$ is stabilizable, and the pair $(A, C_2)$ is detectable.

• $\bar\sigma(D_{11}) < 1$.

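The uncertainty in the component will of course be parametrized by a linear fractional transformation. Let $\boldsymbol{\Delta}$ be any given block structure, with overall dimensions $n_{u_1} \times n_{y_1}$. With respect to this $\boldsymbol{\Delta}$, define the following set of state space quadruples

\[
\mathcal{R}_{\boldsymbol{\Delta}} = \left\{ (\hat{A}, \hat{B}, \hat{C}, \hat{D}) : \hat{A} \text{ is stable},\ \hat{D} + \hat{C} (j\omega I - \hat{A})^{-1} \hat{B} \in \boldsymbol{\Delta} \text{ for all } \omega \in \mathbf{R} \right\} \tag{11.2}
\]

where the matrices are $\hat{A} \in \mathbf{R}^{m \times m}$, $\hat{B} \in \mathbf{R}^{m \times n_{y_1}}$, $\hat{C} \in \mathbf{R}^{n_{u_1} \times m}$, $\hat{D} \in \mathbf{R}^{n_{u_1} \times n_{y_1}}$ and $m$ ranges over all nonnegative integers (the hats distinguish the realization of the perturbation from that of $G$). Furthermore, define a subset of $\mathcal{R}_{\boldsymbol{\Delta}}$ as

\[
\mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}} := \left\{ (\hat{A}, \hat{B}, \hat{C}, \hat{D}) \in \mathcal{R}_{\boldsymbol{\Delta}} : \sup_{\omega} \bar\sigma\left[\hat{D} + \hat{C} (j\omega I - \hat{A})^{-1} \hat{B}\right] \le 1 \right\} \tag{11.3}
\]

The set of components that the pair $(G, \mathcal{R}_{\boldsymbol{\Delta}})$ define are

\[
\begin{bmatrix} \dot{x} \\ \dot{\zeta} \\ y \end{bmatrix} =
\begin{bmatrix}
A + B_1 \hat{D} Z C_1 & B_1 W \hat{C} & B_2 + B_1 \hat{D} Z D_{12} \\
\hat{B} Z C_1 & \hat{A} + \hat{B} Z D_{11} \hat{C} & \hat{B} Z D_{12} \\
C_2 + D_{21} \hat{D} Z C_1 & D_{21} W \hat{C} & D_{22} + D_{21} \hat{D} Z D_{12}
\end{bmatrix}
\begin{bmatrix} x \\ \zeta \\ u \end{bmatrix} \tag{11.4}
\]

where $Z := (I - D_{11} \hat{D})^{-1}$ and $W := (I - \hat{D} D_{11})^{-1}$, and the quadruple $(\hat{A}, \hat{B}, \hat{C}, \hat{D}) \in \mathcal{R}_{\boldsymbol{\Delta}}$.

In a diagram, this is just $F_u(G(s), \Delta(s))$, where $\Delta(s) = \hat{D} + \hat{C} (sI - \hat{A})^{-1} \hat{B}$. This is shown below.

Figure 11.2 General, Uncertain Component Model

This is the general model we will use for a component with structured, unmodeled dynamics.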


Remark: To simplify the discussion, we will treat the perturbations as actual components. This is implicit in the state-space manner in which we have written the perturbed component. That way, we avoid the technical dilemma which occurs when modeling, say, the constant component equal to 1 using an LFT on $G$ defined as

\[
G(s) := \begin{bmatrix} \dfrac{1}{s + a} & 0 \\ b & 1 \end{bmatrix} \tag{11.5}
\]

Note that regardless of $\Delta(s)$, the linear fractional transformation $F_u(G, \Delta) = 1$, so that this does indeed represent the constant component 1. However, the $\mu$ test we are about to describe would give that the uncertainty in this component, if large enough, can cause instability in any closed loop system with this component. If we treat the uncertainty as components, then this interpretation is correct.


Finally, we define an uncertain plant as a linear interconnection of uncertain components, that is itself an uncertain component. Therefore, through its actual inputs and outputs, the dynamics are stabilizable and detectable. A collection of uncertain components defines a new uncertainty structure that also has the block diagonal form. Simply by reordering each of the separate uncertainties, we can assume the structure is like that defined in section 3. The plant also has a multivariable exogenous disturbance and multivariable error. These are additional injections to the component dynamics, and various internal signals from the components. Hence, the uncertain plant is described by a known dynamical system $P(s)$, and a given uncertainty set $\boldsymbol{\Delta}$. In particular, $P$ is described by the state space equations

\[
\begin{bmatrix} \dot{x} \\ y_1 \\ y_2 \\ y_3 \end{bmatrix} =
\begin{bmatrix}
A & B_1 & B_2 & B_3 \\
C_1 & D_{11} & D_{12} & D_{13} \\
C_2 & D_{21} & D_{22} & D_{23} \\
C_3 & D_{31} & D_{32} & D_{33}
\end{bmatrix}
\begin{bmatrix} x \\ u_1 \\ u_2 \\ u_3 \end{bmatrix} \tag{11.6}
\]

where $A$ is stabilizable through $B_3$, and detectable via $C_3$. Let $K$ stabilize the nominal, i.e., $K$ stabilizes the dynamic system described by $A, B_3, C_3, D_{33}$. The diagram below shows the perturbed plant with controller $K$. The signal $u_3$ is the manipulated variable, and this depends on the measurements, $y_3$, via the control law $u_3(s) = K(s) y_3(s)$. The signal $u_2$ is the exogenous disturbance, and $y_2$ is the error. A stable, finite dimensional $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ is the perturbation, and this relates $u_1$ to $y_1$ via the "feedback" $u_1(s) = \Delta(s) y_1(s)$.

Figure 11.3 Perturbed Plant with Feedback Controller

What  questions  would  we  like  to  answer? 

• determine whether the closed loop is stable for all stable $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$, and

• if so, determine how large (in $\|\cdot\|_\infty$ norm) the perturbed disturbance to error map will get.

11.3  Frequency  domain  robustness  tests 

We  have  the  following  facts/assumptions: 

• The controller stabilizes the nominal, hence the internal dynamics of $F_l(P, K)$ are stable. Let $M(s) := F_l(P(s), K(s))$, the closed loop transfer function from $(u_1, u_2)$ to $(y_1, y_2)$. The perturbed disturbance-to-error transfer function is $F_u(M(s), \Delta(s))$.

• The perturbations are themselves viewed as stable components. Therefore, the perturbed closed loop is stable if and only if the transfer function $(I - M_{11}(s) \Delta(s))^{-1}$ is stable. As we shall now see, this can be readily cast as a $\mu$ test on the loop transfer function $M_{11}(j\omega)$.

Theorem 11.3 (Robust Stability) The perturbed closed loop is stable for all $\Delta(\cdot) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$, if and only if $\sup_{\omega} \mu_{\boldsymbol{\Delta}}\left(M_{11}(j\omega)\right) < 1$.

Proof:

($\Leftarrow$) As we have pointed out, we need only check the stability of the transfer function $(I - M_{11}(s) \Delta(s))^{-1}$ for each $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$, from Theorem 11.1. Let $\Delta(s)$ be an arbitrary element of $\mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ and suppose $\sup_{\omega} \mu_{\boldsymbol{\Delta}}\left(M_{11}(j\omega)\right) < 1$. Both are stable, therefore using Theorem 11.2, we only need to show that the Nyquist plot for $\det\left(I + M_{11}(j\omega) \Delta(j\omega)\right)$ does not pass through or encircle the origin. For $\bar\epsilon > 0$, but sufficiently small, the interconnection of $M_{11}$ and $\bar\epsilon \Delta$ will be stable by continuity of eigenvalues (or small gain theorem). Hence the Nyquist plot of $\det\left(I + \bar\epsilon M_{11}(j\omega) \Delta(j\omega)\right)$ must not pass through or encircle the origin. For every $\epsilon \in [\bar\epsilon, 1]$ and every $\omega \in [-\infty, \infty]$,

\[
\det\left(I + \epsilon M_{11}(j\omega) \Delta(j\omega)\right) \ne 0 \tag{11.7}
\]

Setting $\epsilon = 1$ in (11.7) gives that the Nyquist plot for $\det\left(I + M_{11}(j\omega) \Delta(j\omega)\right)$ does not pass through the origin. But, it cannot encircle the origin either. To see this, recall that for small enough $\epsilon$, it did not encircle the origin. As $\epsilon \to 1$, the Nyquist curve of $\det\left(I + \epsilon M_{11}(j\omega) \Delta(j\omega)\right)$ deforms continuously with $\epsilon$, and (11.7) guarantees that it never passes through the origin. This implies that the number of encirclements must stay the same, namely zero, so the actual perturbed loop ($\epsilon = 1$) is indeed stable. A rigorous homotopy argument for this deformation proof can be found in [CheD].

($\Rightarrow$) Suppose $\sup_{\omega} \mu\left[M_{11}(j\omega)\right] \ge 1$. Then for some finite $\bar\omega \in \mathbf{R}$, $\mu\left[M_{11}(j\bar\omega)\right] \ge 1$. Choose a constant, complex matrix $\Delta_c \in \boldsymbol{\Delta}$ such that $\det\left(I + M_{11}(j\bar\omega) \Delta_c\right) = 0$, and $\bar\sigma(\Delta_c) \le 1$. This is always possible. Then the interconnection with $M_{11}$ and $\Delta_c$ has a pole at $s = j\bar\omega$. It is a fairly simple task [CheD] to find a $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ that interpolates $\Delta_c$ at $s = j\bar\omega$. This choice for $\Delta(s)$ destabilizes the loop, and completes the proof. $\Box$

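Next, we answer the question of robust performance: "How large does the perturbed disturbance-to-error map, $F_u(M(s), \Delta(s))$, get as $\Delta$ takes on various values in $\mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$?"

Theorem 11.4 (Robust Performance) Let $P$ be an uncertain plant as defined in (11.6), $\boldsymbol{\Delta}$ be a given uncertainty structure, and $K$ be a finite dimensional, linear, time invariant controller that stabilizes the nominal part of $P$, i.e., $K$ stabilizes the quadruple $(A, B_3, C_3, D_{33})$. Define an augmented structure $\hat{\boldsymbol{\Delta}}$ as

\[
\hat{\boldsymbol{\Delta}} := \left\{ \mathrm{diag}\left[\Delta, \Delta_2\right] : \Delta \in \boldsymbol{\Delta},\ \Delta_2 \in \mathbf{C}^{n_{u_2} \times n_{y_2}} \right\} \tag{11.8}
\]

so that $\hat{\boldsymbol{\Delta}}$ is compatible in dimension to $M(j\omega) := F_l(P, K)(j\omega)$.

Then, the perturbed closed loop is stable, and $\|F_u(M, \Delta)\|_\infty < 1$ for all $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ if and only if $\sup_{\omega} \mu_{\hat{\boldsymbol{\Delta}}}\left(M(j\omega)\right) < 1$.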


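Proof:

($\Leftarrow$) First, we always have

\[
\sup_{\omega} \mu_{\boldsymbol{\Delta}}\left(M_{11}(j\omega)\right) \le \sup_{\omega} \mu_{\hat{\boldsymbol{\Delta}}}\left(M(j\omega)\right) < 1 \tag{11.9}
\]

so using the Robust Stability theorem, for all such $\Delta(s)$, the perturbed loop is stable. Let $\Delta(s) \in \mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ and let $\omega$ be arbitrary. Note that $\Delta(j\omega) \in \boldsymbol{\Delta}$ and $\bar\sigma(\Delta(j\omega)) \le 1$. Since $\mu_{\hat{\boldsymbol{\Delta}}}\left(M(j\omega)\right) < 1$, Theorem 8.3 implies that

\[
\bar\sigma\left(F_u(M(j\omega), \Delta(j\omega))\right) < 1 \tag{11.10}
\]

Therefore, for such a $\Delta(s)$, we get that $\|F_u(M, \Delta)\|_\infty < 1$.

($\Rightarrow$) Suppose that $\sup_{\omega} \mu_{\hat{\boldsymbol{\Delta}}}\left(M(j\omega)\right) \ge 1$. If, in fact, $\sup_{\omega} \mu_{\boldsymbol{\Delta}}\left(M_{11}(j\omega)\right) \ge 1$, then the loop can be destabilized using an element of $\mathbf{B}\mathcal{R}_{\boldsymbol{\Delta}}$ as described in the proof of Theorem 11.3. Otherwise, choose a finite $\bar\omega \in \mathbf{R}$ and $\hat\Delta_c := \mathrm{diag}\left[\Delta_c, \Delta_{2c}\right] \in \hat{\boldsymbol{\Delta}}$ such that $\bar\sigma(\hat\Delta_c) \le 1$ and

\[
\det\left(I - M(j\bar\omega) \hat\Delta_c\right) = 0 \tag{11.11}
\]

Again, use the results in [CheD] to interpolate a stable, rational $\Delta(s)$ such that $\|\Delta(s)\|_\infty \le 1$ and $\Delta_c = \Delta(j\bar\omega)$. Then,

\[
\det\left( \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} - \begin{bmatrix} M_{11}(j\bar\omega) & M_{12}(j\bar\omega) \\ M_{21}(j\bar\omega) & M_{22}(j\bar\omega) \end{bmatrix} \begin{bmatrix} \Delta(j\bar\omega) & 0 \\ 0 & \Delta_{2c} \end{bmatrix} \right) = 0 \tag{11.12}
\]

Since $\bar\sigma(\Delta_{2c}) \le 1$, (11.12) implies that

\[
\bar\sigma\left(F_u(M(j\bar\omega), \Delta(j\bar\omega))\right) \ge 1 \tag{11.13}
\]

which proves the desired result. $\Box$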


These theorems can also be scaled so that the bound on robustness is not 1, but some other positive number. The details are the same, using the basic ideas from the theorems in section 8.2.

12 Counterexamples showing that $\mu$ need not equal the upper bound

This section shows, via two detailed examples, that $\mu(M)$ is not always equal to the $\bar\sigma\left(D M D^{-1}\right)$ upper bound. An appealing aspect of these examples is their simplicity, each using only elementary linear algebra.

12.1  2  repeated  scalar  blocks 

We begin with the block structure $s = 2$ and $f = 0$. We use the results from section 9 on uncertain difference equations to derive the counterexample.

12.1.a Let $a \in (0,1)$ and $\gamma \in (0,1)$ be given. Define the matrix $M \in \mathbf{R}^{4 \times 4}$ by


(12.1) 


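Define a block structure $\boldsymbol{\Delta} := \{\delta I_{2 \times 2} : \delta \in \mathbf{C}\}$. We will investigate the stability of the difference equation

\[
x_{k+1} = F_l(M, \Delta) x_k \tag{12.2}
\]

with various assumptions on the uncertainty $\Delta \in \boldsymbol{\Delta}$. Recall that the results of section 9 addressed just this problem.

12.1.b For all $\Delta = \delta I_{2 \times 2} \in \mathbf{B}\boldsymbol{\Delta}$ the LFT $F_l(M, \Delta)$ is well defined, and appears as

\[
F_l(M, \Delta) = \begin{bmatrix} 0 & \dfrac{1 - a\delta}{1 + a\delta} \\[2mm] \gamma \dfrac{1 + a\delta}{1 - a\delta} & 0 \end{bmatrix} \tag{12.3}
\]

Note that for each such $\Delta$, the spectral radius of $F_l(M, \Delta)$ is simply $\sqrt{\gamma}$, which by assumption is less than 1. Therefore, for fixed, but unknown uncertainties, $\Delta \in \mathbf{B}\boldsymbol{\Delta}$, the system in equation (12.2) is stable. Consequently, with respect to the structure $\hat{\boldsymbol{\Delta}} := \{\mathrm{diag}\left[\delta_1 I_{2 \times 2}, \Delta\right] : \delta_1 \in \mathbf{C}, \Delta \in \boldsymbol{\Delta}\}$, Theorem 9.1 implies that $\mu_{\hat{\boldsymbol{\Delta}}}(M) < 1$.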

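12.1.c Consider the time varying system

\[
x_{k+1} = F_l(M, \Delta_k) x_k \tag{12.4}
\]

where $\Delta_k \in \mathbf{B}\boldsymbol{\Delta}$ for each time step $k$. Take $\Delta_k := I_{2 \times 2}$ when $k$ is even, and $\Delta_k := -I_{2 \times 2}$ when $k$ is odd. Then for $k$ even, $x_{k+2}$ depends on $x_k$ by the relation

\[
x_{k+2} = \begin{bmatrix} 0 & \dfrac{1 + a}{1 - a} \\[2mm] \gamma \dfrac{1 - a}{1 + a} & 0 \end{bmatrix}
\begin{bmatrix} 0 & \dfrac{1 - a}{1 + a} \\[2mm] \gamma \dfrac{1 + a}{1 - a} & 0 \end{bmatrix} x_k \tag{12.5}
\]

which simplifies to

\[
x_{k+2} = \begin{bmatrix} \gamma \dfrac{(1 + a)^2}{(1 - a)^2} & 0 \\[2mm] 0 & \gamma \dfrac{(1 - a)^2}{(1 + a)^2} \end{bmatrix} x_k. \tag{12.6}
\]

For any $\gamma \in (0,1)$, it is easy to choose $a \in (0,1)$ so that (12.6), and hence (12.4), is unstable: it suffices that $\gamma (1 + a)^2 / (1 - a)^2 > 1$. For such choices then, we must have

\[
\inf_{D \in \mathcal{D}} \bar\sigma\left(D M D^{-1}\right) \ge 1 \tag{12.7}
\]

otherwise, by 9.1.c, the time varying system in (12.4) would be stable for $\Delta_k \in \mathbf{B}\boldsymbol{\Delta}$, regardless of the variation with $k$.

Remark: A bit more analysis can show that by proper choice of $\gamma$ and $a$, the value of

\[
\inf_{D \in \mathcal{D}} \bar\sigma\left(D M D^{-1}\right)
\]

can be made arbitrarily close to $1 + \sqrt{2}$ while $\mu_{\hat{\boldsymbol{\Delta}}}(M) < 1$.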
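
The contrast between frozen and time varying perturbations in 12.1.b and 12.1.c can be checked numerically. The sketch below assumes that $F_l(M, \delta I)$ has zero diagonal, upper right entry $(1 - a\delta)/(1 + a\delta)$ and lower left entry $\gamma (1 + a\delta)/(1 - a\delta)$, which is one reading of (12.3) that is consistent with the stated spectral radius $\sqrt{\gamma}$. The values $a = \gamma = 0.5$ and the helper names are chosen only for illustration.

```python
import cmath

def f_lower(a, gamma, delta):
    """Assumed form of F_l(M, delta*I): zero diagonal, off-diagonal product gamma."""
    top = (1.0 - a * delta) / (1.0 + a * delta)
    return [[0.0, top], [gamma / top, 0.0]]

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_radius(m):
    """Spectral radius of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))

a, gamma = 0.5, 0.5

# Frozen perturbations: each has spectral radius sqrt(gamma) < 1.
frozen_plus = spectral_radius(f_lower(a, gamma, 1.0))
frozen_minus = spectral_radius(f_lower(a, gamma, -1.0))

# Alternating perturbation +I, -I: the two-step map has an eigenvalue
# gamma * (1 + a)^2 / (1 - a)^2, which exceeds 1 for these values.
two_step = spectral_radius(matmul(f_lower(a, gamma, -1.0), f_lower(a, gamma, 1.0)))
```

Here each frozen system has spectral radius about 0.707, while the two-step map has spectral radius 4.5, so the alternating sequence is unstable.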


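12.2 1 repeated scalar block, 2 full blocks

Next is an example for a block structure with $s = 1$ and $f = 2$. It is broken down into 8 facts.

12.2.a First, let $\boldsymbol{\Delta} = \{\mathrm{diag}\left[\delta_1, \delta_2\right] : \delta_i \in \mathbf{C}\}$. Then (with respect to this structure) for any complex $\tau \ne 0$,

\[
\mu \begin{bmatrix} 0 & \frac{1}{\tau} \\ \tau & 0 \end{bmatrix} = 1
\]

This follows as a special case of Theorem 3.4.

12.2.b Let $a \in \mathbf{C}$ with $|a| < 1$. Define $G$ on $|\delta| \le 1$ as

\[
G(\delta) = \begin{bmatrix} 0 & \dfrac{1 - a\delta}{1 + a\delta} \\[2mm] \dfrac{1 + a\delta}{1 - a\delta} & 0 \end{bmatrix} \tag{12.8}
\]

Note that everywhere in the unit disk, $G$ is defined and looks like $\begin{bmatrix} 0 & \frac{1}{\tau} \\ \tau & 0 \end{bmatrix}$. Hence from 12.2.a

\[
\sup_{|\delta| \le 1} \mu\left[G(\delta)\right] = 1 \tag{12.9}
\]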

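12.2.c $G$ in (12.8) is a linear fractional transformation. In particular, define the matrix $M$ by

\[
M := \begin{bmatrix}
a & 0 & 2a & 0 \\
0 & -a & 0 & -2a \\
0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0
\end{bmatrix} \tag{12.10}
\]

It is simple to verify that for each $|\delta| \le 1$, $G(\delta) = F_u(M, \delta I_{2 \times 2})$.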
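
The verification mentioned in 12.2.c can also be done numerically. The sketch below evaluates $F_u(M, \delta I_2) = M_{22} + M_{21}\, \delta (I - M_{11} \delta)^{-1} M_{12}$ for the matrix $M$ in (12.10), partitioned into $2 \times 2$ blocks, and compares it with the closed form having upper right entry $(1 - a\delta)/(1 + a\delta)$ and lower left entry $(1 + a\delta)/(1 - a\delta)$. The sample values of $a$ and $\delta$ and the helper names are arbitrary choices for illustration.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def upper_lft(a, delta):
    """F_u(M, delta*I_2) for the matrix M of (12.10), in 2x2 blocks."""
    m12 = [[2 * a, 0], [0, -2 * a]]
    m21 = [[0, 1], [1, 0]]
    m22 = [[0, 1], [1, 0]]
    # M11 = diag(a, -a), so I - delta*M11 is diagonal and inverts entrywise.
    scaled_inv = [[delta / (1 - delta * a), 0], [0, delta / (1 + delta * a)]]
    inner = matmul(matmul(m21, scaled_inv), m12)
    return [[m22[i][j] + inner[i][j] for j in range(2)] for i in range(2)]

def closed_form(a, delta):
    """Closed form of G(delta): zero diagonal, reciprocal off-diagonal entries."""
    tau = (1 + a * delta) / (1 - a * delta)
    return [[0, 1 / tau], [tau, 0]]

def max_difference(a, delta):
    """Largest entrywise gap between the LFT and the closed form."""
    g, h = upper_lft(a, delta), closed_form(a, delta)
    return max(abs(g[i][j] - h[i][j]) for i in range(2) for j in range(2))

samples = [(0.9, 0.5), (0.3 + 0.4j, -0.7j), (-0.8j, 0.6 + 0.6j), (0.5, 1.0)]
worst = max(max_difference(a, d) for a, d in samples)
```

The off-diagonal entries are reciprocals of each other, which is the form used in 12.2.a.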


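12.2.d Define $\boldsymbol{\Delta}_1 = \{\delta I_{2 \times 2} : \delta \in \mathbf{C}\}$, and $\boldsymbol{\Delta}_2 = \{\mathrm{diag}\left[\delta_1, \delta_2\right] : \delta_i \in \mathbf{C}\}$. Certainly $\mu_{1,2}(M)$ makes sense (dimensions are compatible), and $\mu_{1,2}(M) \ge 1$, since $\mu_2(M_{22}) = 1$. Using (12.2.b) and (12.2.c), and Theorem 8.3, with $\beta = 1$, gives $\mu_{1,2}(M) \le 1$. Therefore $\mu_{1,2}(M) = 1$.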


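12.2.e Define the correct scaling sets $\mathcal{D}_1$ and $\mathcal{D}_2$ compatible with $\boldsymbol{\Delta}_1$ and $\boldsymbol{\Delta}_2$. For any $\beta > 1$, and any $D_2 \in \mathcal{D}_2$

\[
\sup_{|\delta| \le \frac{1}{\beta}} \bar\sigma\left(D_2 F_u(M, \delta I_{2 \times 2}) D_2^{-1}\right) \ge \frac{\beta + |a|}{\beta - |a|} \tag{12.11}
\]

This follows from the fact that for any $D_2 = \mathrm{diag}\left[d_1, d_2\right] \in \mathcal{D}_2$,

\[
D_2 F_u(M, \delta I_{2 \times 2}) D_2^{-1} = \begin{bmatrix} 0 & \dfrac{d_1}{d_2} \dfrac{1 - a\delta}{1 + a\delta} \\[2mm] \dfrac{d_2}{d_1} \dfrac{1 + a\delta}{1 - a\delta} & 0 \end{bmatrix} \tag{12.12}
\]

and from the behavior of the nonzero elements of $G(\delta)$ on the edge of disks of radius $\frac{1}{\beta}$, which is shown in the figure below.

Figure 12.1 Magnitudes of Nonzero Elements of $G(\delta)$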
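
12.2.f Fact: Let $\gamma > 0$. If there is a $\Delta_1 \in \boldsymbol{\Delta}_1$, $\bar\sigma(\Delta_1) \le \frac{1}{\gamma}$ such that

• $I - M_{11} \Delta_1$ is invertible

• $\bar\sigma\left[F_u(M, \Delta_1)\right] \ge \gamma$

then

\[
\inf_{D_1 \in \mathcal{D}_1} \bar\sigma\left[ \begin{pmatrix} D_1 & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} D_1^{-1} & 0 \\ 0 & I \end{pmatrix} \right] \ge \gamma \tag{12.13}
\]

This fact is simply the contrapositive of Lemma 8.11.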

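12.2.g If we choose a $\beta > 1$ such that $\frac{\beta + |a|}{\beta - |a|} \ge \beta$, then we can apply the results from (12.2.e) and (12.2.f) above to conclude that

\[
\inf_{D_1, D_2} \bar\sigma\left[ \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} D_1^{-1} & 0 \\ 0 & D_2^{-1} \end{pmatrix} \right] \ge \beta \tag{12.14}
\]

The logic is as follows: first suppose $\beta$ is chosen so that $\frac{\beta + |a|}{\beta - |a|} \ge \beta$. Then from equation (12.12) we know that for every $D_2 \in \mathcal{D}_2$, there is a $\delta \in \mathbf{C}$ with $|\delta| \le \frac{1}{\beta}$ such that

\[
\bar\sigma\left(D_2 F_u(M, \delta I_{2 \times 2}) D_2^{-1}\right) \ge \beta \tag{12.15}
\]

This satisfies the conditions of (12.2.f), therefore, for each $D_2 \in \mathcal{D}_2$

\[
\inf_{D_1 \in \mathcal{D}_1} \bar\sigma\left[ \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} D_1^{-1} & 0 \\ 0 & D_2^{-1} \end{pmatrix} \right] \ge \beta \tag{12.16}
\]

Carrying out the infimum over $\mathcal{D}_2$ in (12.16) yields

\[
\inf_{D \in \mathcal{D}} \bar\sigma\left(D M D^{-1}\right) \ge \beta \tag{12.17}
\]

Therefore the question becomes: "What is the largest $\beta$ such that $\frac{\beta + |a|}{\beta - |a|} \ge \beta$?" Simple algebra gives that this largest $\beta$ is $\beta = \frac{|a| + 1 + \sqrt{|a|^2 + 6|a| + 1}}{2}$. Note that as $|a| \nearrow 1$, the quantity $\beta \nearrow 1 + \sqrt{2}$.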
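
The "simple algebra" in 12.2.g can be written out as follows. This is a short worked derivation added for convenience; it uses only $\beta > 1 > |a|$, so that $\beta - |a| > 0$ and the inequality may be multiplied through by the denominator.

```latex
\begin{align*}
\frac{\beta + |a|}{\beta - |a|} \ge \beta
  &\iff \beta + |a| \ge \beta^2 - |a|\,\beta \\
  &\iff \beta^2 - (1 + |a|)\,\beta - |a| \le 0 \\
  &\iff \beta \le \frac{1 + |a| + \sqrt{(1 + |a|)^2 + 4|a|}}{2}
        = \frac{|a| + 1 + \sqrt{|a|^2 + 6|a| + 1}}{2}.
\end{align*}
% The other root of the quadratic is negative, so it places no constraint on beta > 1.
% At |a| = 1 the bound equals (2 + \sqrt{8})/2 = 1 + \sqrt{2}.
```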

12.2.h In summary: Let $\epsilon > 0$. Choose $a \in \mathbf{C}$, $|a| < 1$ such that

\[
\frac{|a| + 1 + \sqrt{|a|^2 + 6|a| + 1}}{2} > 1 + \sqrt{2} - \epsilon. \tag{12.18}
\]

Define $M$ as in (12.10). Then, with respect to the augmented structure described in (12.2.d), $\mu(M) = 1$ but $\inf_{D \in \mathcal{D}} \bar\sigma\left(D M D^{-1}\right) > 1 + \sqrt{2} - \epsilon$.

This example eliminates many other block structures as well. Since the full blocks were $1 \times 1$ in this example, they may be viewed as repeated scalar blocks instead. Therefore, this counterexample works for $s = 2, f = 1$, and $s = 3, f = 0$ too.


12.3  Conclusions 

In light of this example, it appears that the upper bound can be quite far from the actual value of $\mu$, especially when $s \ne 0$. For instance, in this example, the upper bound (in the limit) equals $\left(1 + \sqrt{2}\right) \times \mu$. Limited computing experience with uncertainty structures having $s \ne 0$ indicates that there is often a gap, though usually not as large. For block structures with no repeated scalar blocks, $s = 0$, this contrasts directly with our computational experience. In that case, the worst known ratio of upper bound to $\mu$ is 1.14, [MorD], and usually, it is much closer to 1. Given that the upper bound can be computed, and in general, it is impossible to verify that a lower bound is indeed $\mu$, how should this all be interpreted?

Suppose an uncertainty structure has only full blocks, and the perturbations are modeled as linear, time invariant. Using the constant, state space $\mu$ test in [DoyP] requires that the actual uncertainty structure be augmented with a large (size of state dimension) repeated scalar block. In view of the counterexample, it is likely that the upper bound will not equal $\mu$, and the conclusions will be conservative. In this situation, a frequency domain upper bound test, [DoyWS], is appropriate, since it scales (a peak $> 1$ does give useful information), and with this block structure, we always have found $\mu$ and the upper bound very close. It is important to realize that the frequency domain test only gives conclusions about linear, time invariant perturbations.

If the perturbations are time varying and/or nonlinear, then, in general the frequency domain tests are not valid, though [Saf2] derives conditions on the frequency dependent scalings which allow for conclusions about slope bounded nonlinearities. The upper bound approaches based on constant matrix operations (for example, the optimal constant scaling, section 10), handle this type of uncertainty, and the motivation which led to their development was the relationship between $\mu$ and the upper bound, and the role this difference plays in the behavior of linear fractional transformations.


13  A  power  method  for  the  structured  singular  value 


This  section  presents  an  iterative  algorithm  to  compute  lower  bounds  for  the  structured 
singular  value.  The  algorithm  resembles  a  mixture  of  power  methods  for  eigenvalues  and 
singular  values,  which  is  not  surprizing,  since  the  structured  singular  value  can  be  viewed 
as  a  generalization  of  both.  If  the  algorithm  converges,  a  lower  bound  for  p.  results.  We 
prove  that  p  is  always  an  equilibrium  point  of  the  algorithm,  however,  since  in  general 
there  are  many  equilibrium  points,  we  also  discuss  heuristic  ideas  to  achieve  convergence. 

In [FanT], the calculation of $\mu$ is reformulated as a smooth optimization problem. As with
all of the known exact expressions for $\mu$, the function that is to be maximized has local
maxima which are not global, so in general the method yields only lower bounds for $\mu$.
Similar comments can be made for the ideas in [Doy] and [Hel], as well as the algorithm
in this paper. The contribution here is yet another lower bound algorithm to aid in the
analysis of robustness of systems with structured uncertainty. This section addresses the
lower bound, and develops a power algorithm aimed at quickly finding local maxima of
$r: \mathbf{B\Delta} \to \mathbf{R}$, defined by $r(\Delta) = \rho(\Delta M)$. Some of the results are
generalizations of those found in [DanKL].

Since we will be interested in local maxima of the function $r(\Delta) = \rho(\Delta M)$, we
begin with some facts from perturbation theory, which will assist in characterizing local
phenomena.

13.1  Matrix  Facts 

13.1.1  Derivatives  of  eigenvalues 

In this section we review the differentiability properties of eigenvalues and eigenvectors of
matrices depending analytically on a real variable. All material comes from [Kat].

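Suppose $M: \mathbf{R} \to \mathbf{C}^{n \times n}$ is an analytic function of the real parameter $t$. If
$\lambda_0$ is an eigenvalue of $M_0 := M(0)$ of multiplicity one, then for some open interval
containing 0, this eigenvalue is an analytic function of $t$, as are the eigenvectors
associated with it. That is, suppose there are nonzero $x_0, y_0 \in \mathbf{C}^n$ satisfying

$$
\begin{aligned}
y_0^* x_0 &= 1 \\
M_0 x_0 &= \lambda_0 x_0 \\
M_0^* y_0 &= \bar{\lambda}_0 y_0
\end{aligned}
\tag{13.1}
$$

Then there is an $\epsilon > 0$ and analytic functions $x: (-\epsilon, \epsilon) \to \mathbf{C}^n$,
$y: (-\epsilon, \epsilon) \to \mathbf{C}^n$, $\lambda: (-\epsilon, \epsilon) \to \mathbf{C}$ such that for all
$t \in (-\epsilon, \epsilon)$

$$
\begin{aligned}
y^*(t) x(t) &= 1 \\
M(t) x(t) &= \lambda(t) x(t) \\
M^*(t) y(t) &= \bar{\lambda}(t) y(t)
\end{aligned}
\tag{13.2}
$$

This follows from [Kat]. Hence, we can differentiate and obtain

$$
\dot{\lambda}(0) = y_0^* \dot{M}(0) x_0
\tag{13.3}
$$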
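
The derivative formula (13.3) is easy to check numerically. The following is a minimal sketch, not part of the original report: it assumes the affine family $M(t) = M_0 + t M_1$ (so $\dot{M}(0) = M_1$), assumes $M_0$ is diagonalizable so that the rows of the inverse eigenvector matrix serve as left eigenvectors normalized to $y^* x = 1$, and compares (13.3) with a central finite difference. The function names and the test matrices are illustrative only.

```python
import numpy as np

def eigenvalue_derivative(M0, M1, target):
    """For M(t) = M0 + t*M1, return (lambda_0, y0^* M1 x0) for the eigenvalue of M0 nearest target."""
    lam, V = np.linalg.eig(M0)                 # columns of V are right eigenvectors
    i = int(np.argmin(np.abs(lam - target)))
    Y_star = np.linalg.inv(V)                  # row i is y_i^*, and y_i^* x_i = 1 automatically
    return lam[i], Y_star[i, :] @ M1 @ V[:, i]

def nearest_eigenvalue(M, target):
    """Eigenvalue of M closest to target (tracks the simple eigenvalue for small t)."""
    lam = np.linalg.eigvals(M)
    return lam[np.argmin(np.abs(lam - target))]

rng = np.random.default_rng(0)
n = 3
# Hypothetical test data: a well separated eigenvalue near 4.
M0 = np.diag([1.0, 2.0, 4.0]) + 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
M1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

lam0, dlam_formula = eigenvalue_derivative(M0, M1, target=4.0)
h = 1e-6
dlam_fd = (nearest_eigenvalue(M0 + h * M1, lam0) - nearest_eigenvalue(M0 - h * M1, lam0)) / (2 * h)
print(abs(dlam_formula - dlam_fd))             # small when the eigenvalue is simple
```

The same check applies to $W(t) = e^{Gt}M$ used later in this section, with $\dot{W}(0) = GM$.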


13.1.2  Linear  algebra  lemmas 

The  next  two  lemmas  are  elementary  linear  algebra.  They  will  be  used  in  the  main  theorem 
of  the  next  section. 

Lemma 13.1 Let $y, x \in \mathbf{C}^n$ with $y \neq 0$ and $x \neq 0$. There exists $d \in \mathbf{R}$, $d > 0$, such that
$y = dx$ if and only if $\mathrm{Re}(y^* W x) \le 0$ for every $W \in \mathbf{C}^{n \times n}$ satisfying $W + W^* \le 0$.

Proof: The "only if" is obvious, so we just prove the "if". As usual, let $y_i$ and $x_i$ denote
the $i$'th element of $y$ and $x$, and $W_{ij}$ denote the $i,j$ element of $W \in \mathbf{C}^{n \times n}$. Begin
by letting $W$ be zero everywhere, except in the $i,i$ element, and set $W_{ii} = \sigma_i + j\omega_i$
for some $\sigma_i \le 0$ and $\omega_i \in \mathbf{R}$. Obviously $W$ satisfies the hypothesis. Then

$$
\mathrm{Re}(y^* W x) = \sigma_i \, \mathrm{Re}(\bar{y}_i x_i) - \omega_i \, \mathrm{Im}(\bar{y}_i x_i)
$$

If $\mathrm{Im}(\bar{y}_i x_i) \neq 0$, then it would be possible to choose $\omega_i \in \mathbf{R}$ to violate the
$\mathrm{Re}(y^* W x) \le 0$ hypothesis. Hence $\mathrm{Im}(\bar{y}_i x_i) = 0$. Similarly, with the only
restriction on $\sigma_i$ being $\sigma_i \le 0$, we must have $\mathrm{Re}(\bar{y}_i x_i) \ge 0$. Therefore, for
each $i$, we can write

$$
\begin{aligned}
y_i &= s_i e^{j\theta_i} \\
x_i &= r_i e^{j\psi_i}
\end{aligned}
$$

where $s_i \ge 0$, $r_i \ge 0$, and $\theta_i, \psi_i \in \mathbf{R}$. From the above discussion, it is clear that for
each $i$,

$$
s_i = 0 \quad \text{or} \quad r_i = 0 \quad \text{or} \quad \theta_i = \psi_i
\tag{13.4}
$$

Now, let $l \neq k$ be two integers $\le n$. Let $\omega \in \mathbf{R}$ be arbitrary. Define a matrix $W$
by $W_{lk} := -e^{-j\omega}$, $W_{kl} := e^{j\omega}$ and zero everywhere else. Note that $W + W^* = 0$, so
trivially $W$ satisfies the hypothesis. In this case

$$
\mathrm{Re}(y^* W x) = -s_l r_k \cos(\theta_l - \psi_k + \omega) + s_k r_l \cos(\theta_k - \psi_l - \omega)
$$

Since $\omega$ is free and neither $x$ nor $y$ is 0, we have for all $i$, $s_i = 0$ if and only if $r_i = 0$.

Consequently, suppose that $s_k \neq 0$ and $s_l \neq 0$. Recall from (13.4) that this means
$\psi_k = \theta_k$ and $\psi_l = \theta_l$. We claim that $s_l r_k = s_k r_l$. To see this, suppose instead that
$s_k r_l \neq s_l r_k$. By choosing $\omega := \theta_k - \theta_l$ or $\omega := \pi + \theta_k - \theta_l$, we get that

$$
\mathrm{Re}(y^* W x) = |s_k r_l - s_l r_k| > 0
$$

This contradicts the original assumptions, hence we must have $s_l r_k = s_k r_l$. Therefore

$$
\frac{s_l}{r_l} = \frac{s_k}{r_k}
$$

for every $k \neq l$ with $s_k \neq 0$ and $s_l \neq 0$. Define $d > 0$ to be this ratio. For every $i$,
we have $y_i = d x_i$ as desired. □


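Lemma 13.2 Let $a$ and $b$ be two nonzero vectors in $\mathbf{C}^n$. Then there exists a hermitian,
positive definite $D \in \mathbf{C}^{n \times n}$ such that $Db = a$ if and only if $b^* a \in (0, \infty)$.

Proof: Again, the "only if" is easy. Conversely, suppose that $\|b\| = 1$. If not, simply
scale appropriately. Let $B_\perp \in \mathbf{C}^{n \times (n-1)}$ be such that the matrix
$K := [\, b \;\; B_\perp \,] \in \mathbf{C}^{n \times n}$ is unitary. Decompose $a$ in this basis, i.e. find a scalar
$\alpha \in \mathbf{C}$ and $\zeta \in \mathbf{C}^{n-1}$ such that

$$
a = \alpha b + B_\perp \zeta
$$

By assumption, $\alpha$ is real and positive. Let $W \in \mathbf{C}^{(n-1) \times (n-1)}$ be any hermitian matrix
such that $W - \frac{1}{\alpha} \zeta \zeta^*$ is positive definite. It is simple to check that

$$
D := K \begin{bmatrix} \alpha & \zeta^* \\ \zeta & W \end{bmatrix} K^*
$$

works. □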
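
The construction in the proof of Lemma 13.2 can be written out directly. The sketch below is illustrative and not from the original report: it picks $W = \frac{1}{\alpha}\zeta\zeta^* + I$ (one admissible choice among many), builds $B_\perp$ from a singular value decomposition of $b^*$, and assumes $n \ge 2$. The function name `hermitian_pd_map` is hypothetical.

```python
import numpy as np

def hermitian_pd_map(a, b):
    """Return a hermitian positive definite D with D @ b = a (requires b^* a > 0, n >= 2)."""
    a = np.asarray(a, dtype=complex)
    b = np.asarray(b, dtype=complex)
    inner = np.vdot(b, a)                        # b^* a (vdot conjugates its first argument)
    if inner.real <= 0 or abs(inner.imag) > 1e-12 * abs(inner):
        raise ValueError("b^* a must be real and positive")
    scale = np.linalg.norm(b)
    b_hat, a_hat = b / scale, a / scale          # D b = a  <=>  D b_hat = a_hat
    # Orthonormal basis of the orthogonal complement of b_hat (null space of b_hat^*).
    _, _, vh = np.linalg.svd(b_hat.conj()[None, :])
    B_perp = vh[1:, :].conj().T
    K = np.column_stack([b_hat, B_perp])         # unitary, first column is b_hat
    alpha = np.vdot(b_hat, a_hat).real           # real and positive by assumption
    zeta = B_perp.conj().T @ a_hat
    # One admissible W: W - zeta zeta^* / alpha = I, which is positive definite.
    W = np.outer(zeta, zeta.conj()) / alpha + np.eye(b.size - 1)
    core = np.block([[np.array([[alpha]], dtype=complex), zeta.conj()[None, :]],
                     [zeta[:, None], W]])
    return K @ core @ K.conj().T

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)
a = (X @ X.conj().T + np.eye(4)) @ b             # guarantees b^* a > 0
D = hermitian_pd_map(a, b)
print(np.linalg.norm(D @ b - a))
```

Positive definiteness of the result follows from the Schur complement of the middle factor: $\alpha > 0$ and $W - \frac{1}{\alpha}\zeta\zeta^* > 0$.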


13.2 Decomposition at $\mu$

We need to define a set $\mathcal{D}_{sd}$, similar to $\mathcal{D}$ from section 3.1. It is the same as $\mathcal{D}$, except the
elements are restricted only to be positive semi-definite, rather than positive definite.

$$
\mathcal{D}_{sd} = \left\{ \mathrm{diag}\,[D_1, \ldots, D_s, d_1 I_{m_1}, \ldots, d_f I_{m_f}] : D_i \in \mathbf{C}^{r_i \times r_i},\; D_i = D_i^* \ge 0,\; d_j \in \mathbf{R},\; d_j \ge 0 \right\}
\tag{13.5}
$$

Theorem 13.3 Let $M \in \mathbf{C}^{n \times n}$ be given, and suppose $\lambda_0 > 0$ is a distinct eigenvalue
of $M$, with right and left eigenvectors $x$ and $y$ respectively, and $y^* x = 1$. Suppose that
$\rho(M) = \lambda_0$. If the function $r: \mathbf{B\Delta} \to \mathbf{R}$ defined by $r(\Delta) = \rho(\Delta M)$ has a local maximum
(with respect to $\mathbf{B\Delta}$) at $\Delta = I$ then there exists a $D \in \mathcal{D}_{sd}$ such that $y = D^2 x$.


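Proof: Let $G \in \mathbf{\Delta}$ with $G + G^* \le 0$. Obviously, $G$ appears as

$$
\mathrm{diag}\,[g_1 I_{r_1}, \ldots, g_s I_{r_s}, G_1, \ldots, G_f]
\tag{13.6}
$$

where $\mathrm{Re}(g_i) \le 0$, and $G_j + G_j^* \le 0$ for all $i$ and $j$. Obviously, at $t = 0$, $e^{Gt} = I$, and
$e^{Gt} \in \mathbf{B\Delta}$ for all $t \ge 0$. Define a matrix function $W: \mathbf{R} \to \mathbf{C}^{n \times n}$ by $W(t) := e^{Gt} M$.
Note that at $t = 0$, $\lambda_0$ is a simple eigenvalue of $W(0)$, with $x$ and $y$ the right and
left eigenvectors. For some nonempty interval containing 0, this eigenvalue is always
simple, and hence there is an analytic function of the real variable $t$, $\lambda(t)$, defined
on that interval, such that $\lambda(t)$ is an eigenvalue of $W(t)$ for all $t$ and $\lambda(0) = \lambda_0$. It
is easy to calculate $\dot{\lambda}(0)$, namely

$$
\dot{\lambda}(0) = y^* \dot{W}(0) x = \lambda_0 \, y^* G x
\tag{13.7}
$$

By hypothesis, $\lambda_0 > 0$, $\rho(M) = \lambda_0$ and the function $\rho(\Delta M)$ has a local maximum at
$\Delta = I$. Therefore

$$
\mathrm{Re}\left( \frac{d}{dt} \lambda(t) \right) \bigg|_{t=0} \le 0
\tag{13.8}
$$

which says that the magnitude of $\lambda$ must be nonincreasing at $t = 0$. Partition $x$ and
$y$ compatibly with the block structure $\mathbf{\Delta}$,

$$
x = \begin{bmatrix} x_{r_1} \\ \vdots \\ x_{r_s} \\ x_{m_1} \\ \vdots \\ x_{m_f} \end{bmatrix}, \qquad
y = \begin{bmatrix} y_{r_1} \\ \vdots \\ y_{r_s} \\ y_{m_1} \\ \vdots \\ y_{m_f} \end{bmatrix}
\tag{13.9}
$$

where $x_{r_i}, y_{r_i} \in \mathbf{C}^{r_i}$ and $x_{m_j}, y_{m_j} \in \mathbf{C}^{m_j}$ for each $i$ and $j$. Using this "block notation",
and substituting (13.6) and (13.7) into (13.8) yields

$$
\mathrm{Re}\left( \sum_{i=1}^{s} g_i \, y_{r_i}^* x_{r_i} + \sum_{j=1}^{f} y_{m_j}^* G_j x_{m_j} \right) \le 0.
\tag{13.10}
$$

This must hold for arbitrary $G \in \mathbf{\Delta}$ satisfying $G + G^* \le 0$. Applying lemmas 13.1
and 13.2, we conclude that for each $i$, there is a $D_i = D_i^* \in \mathbf{C}^{r_i \times r_i}$, $D_i \ge 0$ such that
$y_{r_i} = D_i x_{r_i}$ and for each $j$, there is a $d_j \in \mathbf{R}$, $d_j \ge 0$ such that $y_{m_j} = d_j x_{m_j}$. Arranging
all of these $D_i$'s and $d_j$'s into one block diagonal $D$, and taking the hermitian square
root proves the theorem. □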

Remark: The only restrictive assumption we have made in the above theorem is that the
eigenvalue $\lambda_0$ is distinct. This assures differentiability. Since $\lambda_0$ is a solution of
$\max_{\Delta \in \mathbf{B\Delta}} \max_i |\lambda_i(M\Delta)|$, it is likely that at the maximum it will be distinct.

13.3 Decomposition

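Recall the definition of $\nabla$ from section 7. It was introduced to find descent directions for
$\bar{\sigma}$. We will generalize the definition to be valid for any singular value, not just $\bar{\sigma}$.

Let $M$ be a complex matrix with SVD

$$
M = \beta U V^* + U_2 \Sigma_2 V_2^*.
\tag{13.11}
$$

In this setting, $\beta$ is any singular value of $M$, not necessarily $\bar{\sigma}(M)$, but none of the singular
values in $\Sigma_2$ should equal $\beta$. We use the integer $r > 0$ to denote the multiplicity of $\beta$.
Hence $U, V \in \mathbf{C}^{n \times r}$, $U^* U = V^* V = I_r$, $U_2, V_2 \in \mathbf{C}^{n \times (n-r)}$, $U_2^* U_2 = V_2^* V_2 = I_{n-r}$.

We proceed to define the set $\nabla_{M,\beta}$. Partition $U$ and $V$ compatibly with $\mathbf{\Delta}$ as

$$
U = \begin{bmatrix} A_1 \\ \vdots \\ A_s \\ E_1 \\ \vdots \\ E_f \end{bmatrix}, \qquad
V = \begin{bmatrix} B_1 \\ \vdots \\ B_s \\ F_1 \\ \vdots \\ F_f \end{bmatrix}
\tag{13.12}
$$

where $A_i, B_i \in \mathbf{C}^{r_i \times r}$ and $E_j, F_j \in \mathbf{C}^{m_j \times r}$.

For $\eta \in \mathbf{C}^r$, with $\|\eta\| = 1$, define the following components

$$
\begin{aligned}
P_i^\eta &= A_i \eta \eta^* A_i^* - B_i \eta \eta^* B_i^* \\
p_j^\eta &= \eta^* \left( E_j^* E_j - F_j^* F_j \right) \eta
\end{aligned}
\tag{13.13}
$$

Let $\nabla_{M,\beta} \subset X$ be the set of all such $P^\eta$,

$$
\nabla_{M,\beta} := \left\{ \mathrm{diag}\,[P_1^\eta, \ldots, P_s^\eta, p_1^\eta I_{m_1}, \ldots, p_{f-1}^\eta I_{m_{f-1}}, 0_{m_f}] : P_i^\eta, p_j^\eta \text{ as in (13.13)},\; \eta \in \mathbf{C}^r,\; \|\eta\| = 1 \right\}.
\tag{13.14}
$$

Note that here we use two subscripts on $\nabla$. The first is the matrix, and the second is the
singular value in question. The main reason we introduce $\nabla_{M,\beta}$ here is that if there is a
singular value, $\beta$, of $M$, and $0 \in \nabla_{M,\beta}$, then $\beta$ is a lower bound for $\mu(M)$.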


Theorem 13.4 Let $M$ and a compatible block structure $\mathbf{\Delta}$ be given. Suppose $\beta$ is a
singular value of $M$ with multiplicity $r$. Define $\nabla_{M,\beta}$ as in (13.14). Then $0 \in \nabla_{M,\beta}$ if and
only if there exists a vector $x \in \mathbf{C}^n$, a matrix $X_\perp \in \mathbf{C}^{n \times n}$, a matrix $Q \in \mathcal{Q}$, such that
$\|x\| = 1$, $x^* X_\perp = 0$, $X_\perp x = 0$, and

$$
QM = \beta x x^* + X_\perp
\tag{13.15}
$$

Proof: Let the SVD of $M$ be

$$
M = \beta U V^* + U_2 \Sigma_2 V_2^*
\tag{13.16}
$$

($\rightarrow$) If $0 \in \nabla_{M,\beta}$ then there exists an $\eta \in \mathbf{C}^r$, $\|\eta\| = 1$ such that

$$
\begin{aligned}
A_i \eta \eta^* A_i^* - B_i \eta \eta^* B_i^* &= 0, \quad i \le s \\
\eta^* \left( E_j^* E_j - F_j^* F_j \right) \eta &= 0, \quad j \le f - 1
\end{aligned}
$$

These relations, and the partition in (13.12) imply that there is $Q \in \mathcal{Q}$ such that

$$
Q U \eta = V \eta
\tag{13.17}
$$

Define $x \in \mathbf{C}^n$ as the above: $x := Q U \eta = V \eta$. Since $\|\eta\| = 1$ and $U$ and $V$ are
isometries, $\|x\| = 1$. Simple manipulation of (13.16) and (13.17) gives

$$
\begin{aligned}
(QM) x &= (QM) V \eta = \beta Q U \eta = \beta V \eta = \beta x \\
x^* (QM) &= \eta^* U^* Q^* (QM) = \beta \eta^* V^* = \beta x^*
\end{aligned}
$$

Defining $X_\perp := QM - \beta x x^*$ completes the decomposition.

($\leftarrow$) Suppose $Q$, $x$, and $X_\perp$ are given as in the hypothesis, so that

$$
QM = \beta x x^* + X_\perp
$$

Define $\tilde{M} := QM$. A singular value decomposition of $\tilde{M}$ is

$$
\tilde{M} = \beta (QU) V^* + (Q U_2) \Sigma_2 V_2^*
$$

Hence $\beta$ is a singular value of $\tilde{M}$, and $\tilde{M} x = \beta x$ and $\tilde{M}^* x = \beta x$, and so there exists
a vector $\eta \in \mathbf{C}^r$, $\|\eta\| = 1$ such that

$$
x = Q U \eta = V \eta
$$

This implies that $0 \in \nabla_{M,\beta}$ as desired. □

It is obvious from the decomposition that $\beta$ is a lower bound for $\mu(M)$ since $\beta$ is an
eigenvalue of $\tilde{M} = QM$. The following corollary follows immediately.

Corollary 13.5 Let $M$ and a compatible block structure $\mathbf{\Delta}$ be given. Suppose $D \in \mathcal{D}$,
and that $\beta$ is a singular value of $DMD^{-1}$ with multiplicity $r$. Define $\nabla_{DMD^{-1},\beta}$ as above.
Then $0 \in \nabla_{DMD^{-1},\beta}$ if and only if there exists a vector $x \in \mathbf{C}^n$, a matrix $X_\perp \in \mathbf{C}^{n \times n}$, and
a matrix $Q \in \mathcal{Q}$ such that $\|x\| = 1$, $x^* X_\perp = 0$, $X_\perp x = 0$, and

$$
Q D M D^{-1} = \beta x x^* + X_\perp
\tag{13.18}
$$
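
The lower bound property used here, that an eigenvalue magnitude of $QM$ with $Q \in \mathcal{Q}$ never exceeds $\mu(M)$, can be explored with a crude numerical experiment. The sketch below is not the algorithm of this section: it simply samples random structured unitary matrices $Q$ and records the largest spectral radius $\rho(QM)$ seen, which is a (usually loose) lower bound, and compares it with the upper bound $\bar{\sigma}(M)$. The function names and the block sizes in the example are hypothetical.

```python
import numpy as np

def random_structured_unitary(rng, scalar_sizes, full_sizes):
    """Sample Q = diag[q_1 I_{r_1}, ..., q_s I_{r_s}, Q_1, ..., Q_f], |q_i| = 1, Q_j unitary."""
    blocks = []
    for r in scalar_sizes:
        blocks.append(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi)) * np.eye(r))
    for m in full_sizes:
        Z = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
        Qj, _ = np.linalg.qr(Z)                  # the Q factor of a QR factorization is unitary
        blocks.append(Qj)
    n = sum(scalar_sizes) + sum(full_sizes)
    Q = np.zeros((n, n), dtype=complex)
    k = 0
    for B in blocks:
        size = B.shape[0]
        Q[k:k + size, k:k + size] = B
        k += size
    return Q

def sampled_lower_bound(M, scalar_sizes, full_sizes, samples=200, seed=0):
    """Largest rho(Q M) over randomly sampled structured unitary Q."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(samples):
        Q = random_structured_unitary(rng, scalar_sizes, full_sizes)
        best = max(best, float(np.max(np.abs(np.linalg.eigvals(Q @ M)))))
    return best

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
lower = sampled_lower_bound(M, scalar_sizes=[2], full_sizes=[2], samples=50)
upper = np.linalg.svd(M, compute_uv=False)[0]   # sigma_max(M), an upper bound for mu
print(lower, upper)
```

Random sampling is only meant to make the inequality $\rho(QM) \le \mu(M) \le \bar{\sigma}(M)$ concrete; it scales poorly and is no substitute for the decomposition-based iteration developed below.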


The main result of this section is that there is (almost) always a decomposition as in
(13.18) with $\beta = \mu(M)$ (remember, any $\beta$ satisfying (13.18) is a lower bound for $\mu(M)$).
A preliminary result toward that result is next.


Theorem 13.6 Let $Q_0 \in \mathcal{Q}$ be the optimizer for $\max_{Q \in \mathcal{Q}} \rho(QM)$, and suppose that the
eigenvalue associated with $\rho(Q_0 M)$ is distinct, call it $\lambda_0$, and $\lambda_0$ is real and positive. If $x$ and $y$
are the right and left eigenvectors of the eigenvalue $\lambda_0$, then there exists a $D \in \mathcal{D}_{sd}$ such
that

$$
\begin{aligned}
Q_0 M x &= \lambda_0 x \\
x^* D^2 Q_0 M &= \lambda_0 x^* D^2
\end{aligned}
\tag{13.19}
$$

Remark: If we consider local maxima of a function $r: \mathcal{Q} \to \mathbf{R}$ given by $r(Q) = \rho(QM)$,
then the above theorem is not true. For $r$ as defined here, there exist examples where
$r$ has a local maximum, but the decomposition described in (13.19) does not exist.

Proof: By Theorem 13.3, any maximizer of $\max_{Q \in \mathcal{Q}} \rho(QM)$ is also a maximizer of
$\max_{\Delta \in \mathbf{B\Delta}} \rho(\Delta M)$. Define $\tilde{M} := Q_0 M$, then $\Delta = I$ is a local maximizer for
$\max_{\Delta \in \mathbf{B\Delta}} \rho(\Delta \tilde{M})$. Apply lemmas 13.1 and 13.2 to prove the theorem. □


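In order to state the main theorem, we introduce some additional notation: partition the
vectors $x$ and $y$ compatibly with the block structure,

$$
x = \begin{bmatrix} x_{r_1} \\ x_{r_2} \\ \vdots \\ x_{r_s} \\ x_{m_1} \\ x_{m_2} \\ \vdots \\ x_{m_f} \end{bmatrix}, \qquad
y = \begin{bmatrix} y_{r_1} \\ y_{r_2} \\ \vdots \\ y_{r_s} \\ y_{m_1} \\ y_{m_2} \\ \vdots \\ y_{m_f} \end{bmatrix}
\tag{13.20}
$$

with $x_{r_i}, y_{r_i} \in \mathbf{C}^{r_i}$ and $x_{m_j}, y_{m_j} \in \mathbf{C}^{m_j}$. We call these the "block components of $x$ and $y$".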


Theorem 13.7 Let the assumptions of Theorem 13.6 hold. Consider the block components
of the eigenvectors $x$ and $y$ as in (13.20). If for all $i$, $y_{r_i}^* x_{r_i} \neq 0$, and for all $j$, neither
$x_{m_j}$ nor $y_{m_j}$ is the 0 vector, then there exists a $D \in \mathcal{D}$ such that

$$
\begin{aligned}
Q_0 D M D^{-1} (Dx) &= \lambda_0 (Dx) \\
(Dx)^* Q_0 D M D^{-1} &= \lambda_0 (Dx)^*.
\end{aligned}
\tag{13.21}
$$

Remark: This result was first shown in [FanT], for the case of $s = 0$.

Proof: These additional assumptions guarantee that the $D$'s in Theorem 13.6 are in fact
positive definite; rearranging equation (13.19) gives (13.21). This is the decomposition,
since $Dx$ is both a right and left eigenvector of $Q_0 D M D^{-1}$ associated with the
eigenvalue $\lambda_0$. □


13.4  Lower  bound  power  algorithm 

How can this decomposition be used? In this section, we propose an iterative algorithm to
find such decompositions, and therefore get lower bounds for $\mu$. The possible advantage
this algorithm has over finding local maxima of the $\max_{Q \in \mathcal{Q}} \rho(QM)$ lower bound is that
there will be no costly eigenvalue/eigenvector evaluations, which would be necessary for
cost/gradient calculations. Numerical experimentation indicates that the algorithm often
completes successfully and quickly.

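Rewriting (13.21), we want to find a $Q \in \mathcal{Q}$, $D \in \mathcal{D}$, $\beta > 0$, and $x \in \mathbf{C}^n$ with $\|x\| = 1$
such that

$$
\begin{aligned}
Q D M D^{-1} x &= \beta x \\
D^{-1} M^* D Q^* x &= \beta x
\end{aligned}
$$

which can be rewritten as

$$
\begin{aligned}
M (D^{-1} x) &= \beta (D^{-1} Q^* x) \\
M^* (D Q^* x) &= \beta (D x).
\end{aligned}
$$

For a given $D$, $Q$, and $x$, define vectors $a$, $b$, $z$, and $w$ by

$$
\begin{aligned}
b &:= D^{-1} x \\
a &:= D^{-1} Q^* x \\
z &:= D Q^* x \\
w &:= D x
\end{aligned}
\tag{13.22}
$$

With this definition, we have $Mb = \beta a$ and $M^* z = \beta w$. We can eliminate $x$ from (13.22),
and redefine $D = D^2$ to get

$$
\begin{aligned}
b &= Q a \\
z &= Q^* w \\
z &= D a \\
b &= D^{-1} w
\end{aligned}
$$

We would like to write these four new relationships in a manner that does not involve the
matrices $Q$ and $D$. With a few technical conditions, this can be done. In order to simplify
the upcoming formulas, we will consider a block structure with $s = 1$, $f = 1$. By simply
duplicating the appropriate formulas for additional blocks, it is straightforward to extend
the algorithm to more general structures. Hence the sets $\mathcal{D}$ and $\mathcal{Q}$ look like

$$
\begin{aligned}
\mathcal{D} &= \left\{ \mathrm{diag}\,[D_1, d_2 I_{m_1}] : D_1 \in \mathbf{C}^{r_1 \times r_1},\; D_1 = D_1^* > 0,\; d_2 \in \mathbf{R},\; d_2 > 0 \right\} \\
\mathcal{Q} &= \left\{ \mathrm{diag}\,[q_1 I_{r_1}, Q_2] : q_1^* q_1 = 1,\; Q_2 \in \mathbf{C}^{m_1 \times m_1},\; Q_2^* Q_2 = I \right\}.
\end{aligned}
$$

With respect to this, we will partition the vectors accordingly, so

$$
z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
$$

where $z_1 \in \mathbf{C}^{r_1}$ and $z_2 \in \mathbf{C}^{m_1}$, and likewise for the other vectors.
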


Lemma 13.8 Let $r_1$ and $m_1$ be positive integers. Let $z_1, w_1, b_1, a_1 \in \mathbf{C}^{r_1}$ and
$z_2, w_2, b_2, a_2 \in \mathbf{C}^{m_1}$ be nonzero vectors with $a_1^* w_1 \neq 0$. Then, there exists a $D \in \mathcal{D}$ and
$Q \in \mathcal{Q}$ such that

$$
\begin{aligned}
b &= Q a \\
z &= Q^* w \\
z &= D a \\
b &= D^{-1} w
\end{aligned}
$$

if and only if

$$
\begin{aligned}
z_1 &= \frac{w_1^* a_1}{|w_1^* a_1|} \, w_1, &
z_2 &= \frac{\|w_2\|}{\|a_2\|} \, a_2, \\
b_1 &= \frac{a_1^* w_1}{|a_1^* w_1|} \, a_1, &
b_2 &= \frac{\|a_2\|}{\|w_2\|} \, w_2.
\end{aligned}
$$

Proof:

($\rightarrow$) Follows by direct substitution.

($\leftarrow$) Let $q_1 = \dfrac{a_1^* w_1}{|a_1^* w_1|}$, since this is well defined. Likewise, choose
$d_2 = \dfrac{\|w_2\|}{\|a_2\|}$. By assumption, $d_2$ is well defined, and nonzero.

Obviously, $\|w_2\| = \|z_2\|$, so let $Q_2^*$ be the rotation that takes $w_2$ into $z_2$. A quick
calculation shows that $Q_2^*$ also rotates $b_2$ into $a_2$,

$$
Q_2^* b_2 = \frac{1}{d_2} Q_2^* w_2 = \frac{1}{d_2} z_2 = a_2
$$

Next, we calculate $a_1^* z_1$. Plugging in gives $a_1^* z_1 = |a_1^* w_1|$. By assumption, this
is nonzero, hence Lemma 13.2 yields a hermitian, positive definite $D_1$ such that
$D_1 a_1 = z_1$. As we hope, $D_1$ takes $b_1$ into $w_1$ too,

$$
D_1 b_1 = q_1 D_1 a_1 = q_1 z_1 = w_1
$$

Defining $D$ and $Q$ in the obvious manner completes the proof. □


This  gives  us  the  main  theorem. 


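Theorem 13.9 Let $M \in \mathbf{C}^{n \times n}$ be given, and let $\mathbf{\Delta}$ be the two block ($s = 1$, $f = 1$)
structure defined above, with block sizes $r_1$ and $m_1$, where $r_1 + m_1 = n$. Suppose $\beta > 0$ is
given. Then there exists $Q \in \mathcal{Q}$, $D \in \mathcal{D}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbf{C}^n$, and $X_\perp \in \mathbf{C}^{n \times n}$ such that

$$
\begin{aligned}
& \|x\| = 1, \quad x_1 \neq 0, \quad x_2 \neq 0 \\
& x^* X_\perp = 0, \quad X_\perp x = 0 \\
& Q D M D^{-1} = \beta x x^* + X_\perp
\end{aligned}
\tag{13.25}
$$

if and only if there exist nonzero vectors $z_1, w_1, b_1, a_1 \in \mathbf{C}^{r_1}$ and $z_2, w_2, b_2, a_2 \in \mathbf{C}^{m_1}$ with
$a_1^* w_1 \neq 0$ and

$$
\begin{aligned}
\beta a &= M b, &
\beta w &= M^* z, \\
z_1 &= \frac{w_1^* a_1}{|w_1^* a_1|} \, w_1, &
b_1 &= \frac{a_1^* w_1}{|a_1^* w_1|} \, a_1, \\
z_2 &= \frac{\|w_2\|}{\|a_2\|} \, a_2, &
b_2 &= \frac{\|a_2\|}{\|w_2\|} \, w_2.
\end{aligned}
\tag{13.26}
$$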


Remark: In order to find decompositions using the representation this theorem allows
(equation (13.26), free of $Q$'s and $D$'s), we can restrict ourselves to unit vectors
$a, b, z, w$. Why? Suppose we find nonzero vectors satisfying (13.26). Examining these
equations, it is clear that scaling $z$ and $w$ by a positive scalar and scaling $b$ and $a$ by a
(possibly different) positive scalar does not affect any of the equalities in (13.26). Since
these equations always imply that $\|z\| = \|w\|$, and $\|a\| = \|b\|$, we can indeed look only
at unit vectors.

In the above theorem, we have written the conditions (13.26) in a suggestive manner. We
will attempt to find solutions to (13.26) in an iterative fashion. In particular, for $i = 1, 2$,
let vectors $a_{i,k}$, $b_{i,k}$, $z_{i,k}$, and $w_{i,k}$ evolve as

$$
\begin{aligned}
\tilde{\beta}_{k+1} a_{k+1} &= M b_k \\
z_{1,k+1} &= \frac{w_{1,k}^* a_{1,k+1}}{|w_{1,k}^* a_{1,k+1}|} \, w_{1,k} \\
z_{2,k+1} &= \frac{\|w_{2,k}\|}{\|a_{2,k+1}\|} \, a_{2,k+1} \\
\hat{\beta}_{k+1} w_{k+1} &= M^* z_{k+1} \\
b_{1,k+1} &= \frac{a_{1,k+1}^* w_{1,k+1}}{|a_{1,k+1}^* w_{1,k+1}|} \, a_{1,k+1} \\
b_{2,k+1} &= \frac{\|a_{2,k+1}\|}{\|w_{2,k+1}\|} \, w_{2,k+1}
\end{aligned}
\tag{13.27}
$$

where $\tilde{\beta}_{k+1}$ and $\hat{\beta}_{k+1}$ are chosen $> 0$, so that $\|a_{k+1}\| = \|w_{k+1}\| = 1$.

Note also that if the initial $b$ and $w$ vectors that start the iteration are unit vectors, then
at every step, all vectors $a$, $b$, $z$, and $w$ will be unit length.
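
The iteration (13.27) for the two block structure ($s = 1$, $f = 1$) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the random starting vectors, the stopping test, the tolerance, and the function name `mu_power_iteration` are all choices made here. The two returned scalars are the positive normalizing constants of the $a$ update and the $w$ update; when the iteration converges they agree and give a lower bound for $\mu(M)$.

```python
import numpy as np

def mu_power_iteration(M, r1, max_iter=500, tol=1e-10, seed=0):
    """Iteration (13.27): one scalar-times-identity block of size r1, then one full block.

    Returns (beta_a, beta_w, converged)."""
    n = M.shape[0]
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    b /= np.linalg.norm(b)                      # unit starting vectors keep all iterates unit length
    w /= np.linalg.norm(w)
    beta_a = beta_w = 0.0
    converged = False
    for _ in range(max_iter):
        Mb = M @ b
        beta_a = np.linalg.norm(Mb)
        if beta_a == 0:
            raise ValueError("M b = 0: restart from a different initial condition")
        a = Mb / beta_a
        a1, a2, w1, w2 = a[:r1], a[r1:], w[:r1], w[r1:]
        c = np.vdot(w1, a1)                     # w1^* a1
        if c == 0 or np.linalg.norm(a2) == 0:
            raise ValueError("z update not well defined: restart")
        z = np.concatenate([c / abs(c) * w1,
                            np.linalg.norm(w2) / np.linalg.norm(a2) * a2])
        Mz = M.conj().T @ z
        beta_w = np.linalg.norm(Mz)
        if beta_w == 0:
            raise ValueError("M^* z = 0: restart from a different initial condition")
        w_new = Mz / beta_w
        w1n, w2n = w_new[:r1], w_new[r1:]
        c = np.vdot(a1, w1n)                    # a1^* w1 with the updated w
        if c == 0 or np.linalg.norm(w2n) == 0:
            raise ValueError("b update not well defined: restart")
        b_new = np.concatenate([c / abs(c) * a1,
                                np.linalg.norm(a2) / np.linalg.norm(w2n) * w2n])
        step = np.linalg.norm(b_new - b) + np.linalg.norm(w_new - w)
        b, w = b_new, w_new
        if step < tol and abs(beta_a - beta_w) < tol:
            converged = True
            break
    return beta_a, beta_w, converged

beta_a, beta_w, ok = mu_power_iteration(np.diag([2.0, 1.0]).astype(complex), r1=1)
print(beta_a, beta_w, ok)
```

Because $b$ stays a unit vector, the first returned value never exceeds $\bar{\sigma}(M)$, whether or not the iteration has converged; only a converged value should be read as a lower bound for $\mu(M)$.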

13.a There are many other iterative algorithms besides (13.27) that have decompositions
(Theorem 13.7) as equilibrium points. For instance, simply rearranging the order of
our iteration in (13.27) will yield a different algorithm, yet decompositions are still
the equilibrium points. What we really want is an algorithm where the only stable
equilibrium points are decompositions with large (relative to $\mu(M)$) converged $\beta$
values. Other iteration schemes may be better suited toward this goal; discovering
them will give a better lower bound algorithm.

13.b Potential problems are:

• If $M b_k = 0$ ($M^* z_k = 0$), then $a_{k+1}$ ($w_{k+1}$) is not well defined.

• If $a_{1,k}^* w_{1,k} = 0$, then the vectors $z_{1,k+1}$ and/or $b_{1,k+1}$ are not well defined.

• Either $\|w_{2,k}\| = 0$ or $\|a_{2,k}\| = 0$, making $b_{2,k}$ and/or $z_{2,k}$ not well defined.

The heuristic fix when any of these happen is to restart the algorithm at a different
initial condition.

13.c If everything goes ok, and all of the indexed quantities converge, then we must
have $\tilde{\beta} = \hat{\beta}$. This is easy to see. Suppose the equations in (13.26) are satisfied
(convergence of the algorithm in (13.27)), but the $\beta$ associated with $b$ and $a$ is $\tilde{\beta}$ and
the $\beta$ associated with $z$ and $w$ is $\hat{\beta}$. The converged equations imply that there exists
a $Q \in \mathcal{Q}$ and $D \in \mathcal{D}$ such that $Q D M D^{-1} (Db) = \tilde{\beta} (Db)$ and $(Q D M D^{-1})^* (Db) =
\hat{\beta} (Db)$. Since the $\beta$'s are real, they must be equal.

13.d • If there is only the first block, which is a scalar times identity block, the iteration
would be a power iteration for the largest (in magnitude) eigenvalue of the
matrix $M$. Since $\mu$ for 1 scalar times identity block is the spectral radius, the
algorithm we have proposed reduces to a valid algorithm in the special case of
1 scalar times identity block.

• If there is only the second block, which is a full block, the iteration becomes
an eigenvalue power algorithm for $M^* M$, hence it will give the largest singular
value of $M$. Again, with respect to this specific block structure, this is what we
want.

Hence, the iteration we have proposed is an even mix of two separate, well understood
iterations. Both of these converge to the largest eigenvalue/singular value. Therefore,
we are led to guess (incorrectly) that this algorithm will converge to the largest $\beta$
for which a decomposition described in Theorem 13.4 exists.
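
The full block special case in 13.d is easy to write down. With a single full block and unit vectors, the updates of (13.27) collapse to $z = a$ and $b = w$, so one pass is $b \mapsto M^* M b$ followed by normalization. The sketch below is illustrative (the function name, iteration count, and test matrix are assumptions) and checks the result against a matrix built with known singular values.

```python
import numpy as np

def sigma_max_power(M, iters=200, seed=0):
    """Single full block: the iteration reduces to a power iteration on M^* M."""
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(M.shape[1]) + 1j * rng.standard_normal(M.shape[1])
    b /= np.linalg.norm(b)
    beta = 0.0
    for _ in range(iters):
        a = M @ b
        beta = np.linalg.norm(a)        # normalizing scalar of the a update
        a /= beta
        w = M.conj().T @ a              # z = a when all vectors have unit length
        w /= np.linalg.norm(w)
        b = w                           # b = w for a single full block
    return beta

# Hypothetical test matrix with singular values 3, 1, 0.5.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
M = U @ np.diag([3.0, 1.0, 0.5]) @ V.conj().T
print(sigma_max_power(M))
```

In this special case the largest singular value is the only stable equilibrium, which is the behavior the general iteration does not inherit.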

Extensive  computational  experience  has  led  to  the  following  conclusions: 

1. The difficulties described in 13.b above do not seem to occur in practice. While it
is easy to construct matrices where these problems happen, running the algorithm
on frequency responses of actual closed loop systems has not been a problem.

2. Limit cycles occur more often when there are large scalar times identity blocks. The
presence of a stable limit cycle does not immediately give rise to a lower bound for
$\mu$.

3. If $s = 0$ (and oftentimes when $s > 0$), the algorithm usually performs well, converging
quickly, and providing a lower bound which is better than $\rho(M)$. We have
successfully run tests on $40 \times 40$ complex matrices with up to 40 complex uncertainties.

4. The promising properties described above are not always true. We have
examples of a stable equilibrium point with the corresponding $\beta < \mu(M)$. With lack
of any further insight, we do not bother to reproduce this here. The block structure
was five $1 \times 1$ blocks.

5. In general, there are several stable equilibrium points, with different values of $\beta$.
This is to be contrasted with the conventional power algorithms for $\rho$ and $\bar{\sigma}$, where
only the largest ones are stable.

13.5 Choosing starting vectors

This section heuristically addresses the question of "what should the starting vectors be?"
To motivate what follows, suppose that $\mu(M) = \inf_{D \in \mathcal{D}} \bar{\sigma}(D M D^{-1})$, and that the infimum
is achieved by $D_0$. Then, from Theorem 13.4, we must have $0 \in \nabla_{D_0 M D_0^{-1}, \mu}$. Therefore, if

$$
\tilde{M} := D_0 M D_0^{-1} = \mu U V^* + U_2 \Sigma_2 V_2^*
\tag{13.28}
$$

is a singular value decomposition, there is an $\eta \in \mathbf{C}^r$ and $Q \in \mathcal{Q}$ such that

$$
\begin{aligned}
\tilde{M} V \eta &= \mu Q V \eta \\
\tilde{M}^* (Q V \eta) &= \mu V \eta
\end{aligned}
\tag{13.29}
$$

Hence, with respect to $\tilde{M}$ (which has $\mu(\tilde{M}) = \mu(M)$), the vectors

$$
\begin{aligned}
b &:= V \eta \\
w &:= V \eta
\end{aligned}
\tag{13.30}
$$

are the correct vectors for the decomposition. We therefore propose the following.


1. Using a cheap method ([Osb]), find a $D_{os}$ that nearly minimizes $\inf_{D \in \mathcal{D}} \bar{\sigma}(D M D^{-1})$.

2. Absorb this into $M$, i.e., define $\tilde{M} := D_{os} M D_{os}^{-1}$.

3. Choose $b_1 = w_1$ to be a right singular vector associated with $\bar{\sigma}(\tilde{M})$.

4. Perform the iteration on $\tilde{M}$ with these starting vectors.

To conclude, we analyze the robustness of a nominally stable system subject to structured
perturbations. The example has no physical interpretation, and is intended only to
illustrate the various robustness theorems we have presented.


The system $G$ (which can be interpreted as $F_l(P(s), K(s))$ as in the previous section) is
given below. It has 4 states, and 9 inputs and 9 outputs.


Matrix : G.a

states 4

             x1           x2           x3           x4
x1   -6.406e-01   -5.471e+00   -4.185e+00    2.198e+00
x2    0.000e+00   -3.000e+00    0.000e+00    0.000e+00
x3    2.198e+00   -5.098e+00   -2.627e+00    4.185e+00
x4   -1.987e+00   -6.756e+00   -8.384e+00    1.558e+00

Matrix : G.b

states 4  inputs 9

             u1           u2           u3           u4           u5           u6           u7           u8           u9
x1    5.338e-01    0.000e+00    0.000e+00    0.000e+00    0.000e+00    0.000e+00    0.000e+00    0.000e+00    7.000e-01
x2    0.000e+00    0.000e+00    0.000e+00    0.000e+00    2.668e-01    8.893e-02    0.000e+00    0.000e+00    7.000e-01
x3    5.336e-01    0.000e+00    5.336e-01    2.668e-01    0.000e+00    0.000e+00    0.000e+00    0.000e+00    7.000e-01
x4    0.000e+00    5.338e-01    0.000e+00    0.000e+00    0.000e+00    0.000e+00   -8.893e-02    8.893e-02    7.000e-01

Matrix : G.c

states 4  outputs 9

             x1           x2           x3           x4
y1    2.500e-01    0.000e+00    2.500e-01    0.000e+00
y2    0.000e+00    2.500e-01    0.000e+00    0.000e+00
y3    0.000e+00    0.000e+00    0.000e+00    2.500e-01
y4    0.000e+00    5.000e-01    0.000e+00    0.000e+00
y5    0.000e+00    5.000e-01    0.000e+00    5.000e-01
y6    0.000e+00    0.000e+00    0.000e+00    1.500e+00
y7    0.000e+00    1.600e+00    0.000e+00    0.000e+00
y8    0.000e+00    0.000e+00    1.500e+00    1.500e+00
y9    0.000e+00   -1.000e+00    0.000e+00    1.000e+00


The first 8 inputs and outputs are associated with the perturbation structure, $\mathbf{\Delta}$, $s = 3$,
$f = 0$, $r_1 = 3$, $r_2 = 2$, $r_3 = 3$. The last input and output correspond to the exogenous
disturbance and resulting error. Hence, for robust performance calculations, we will
append a $1 \times 1$ full block to $\mathbf{\Delta}$ for the performance calculation. For notational purposes,
we partition $G(s)$ into

$$
G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix}
\tag{14.1}
$$

where $G_{11}(s)$ is $8 \times 8$, and $G_{22}(s)$ is $1 \times 1$.

We first calculate the robustness with respect to linear, time invariant perturbations, using
the frequency domain techniques described in section 11. This is done via a $\mu$ test on
$G_{11}(j\omega)$. At each frequency point, we calculate the $\bar{\sigma}(D M D^{-1})$ upper bound, and a lower
bound using the algorithm described in section 13.

• Figure 14.1 is simply the singular values of $G_{11}(j\omega)$ versus $\omega$. This implies that
for any unstructured (any $8 \times 8$), stable perturbation, with induced norm from
$L_2 \to L_2$ less than $1 / \sup_\omega \bar{\sigma}(G_{11}(j\omega))$, the perturbed closed loop system is stable.
The Nyquist argument also shows that there is a linear, time invariant, unstructured, stable
perturbation, $\Delta_u$, with $\sup_\omega \bar{\sigma}(\Delta_u(j\omega))$ equal to this bound that does cause instability.

• Next, we calculate upper and lower bounds for $\mu(G_{11}(j\omega))$, with respect to the
block structure $\mathbf{\Delta} := \{ \mathrm{diag}\,[\delta_1 I_3, \delta_2 I_2, \delta_3 I_3] : \delta_i \in \mathbf{C} \}$. The upper bound is based
on the generalized gradient material from section 6.1, and the lower bound is the
iterative procedure described in section 13. These two curves are nearly equal, and
are shown in Figure 14.2. This implies that for FDLTI perturbations $\Delta(s)$ with the
correct structure, the stability is preserved as long as
$\|\Delta(s)\|_\infty < 1 / \sup_\omega \mu(G_{11}(j\omega))$, and there is
a perturbation on that boundary that does cause instability.


Figure 14.1 Frequency Response Singular Value Plot: $\bar{\sigma}(G_{11}(j\omega))$ versus frequency (radians/second)

Figure 14.2 Frequency Response $\mu$ Plot: $\mu(G_{11}(j\omega))$ versus frequency (radians/second)


• What about performance? Nominally, the transfer function $G_{22}$ describes the
performance, and this is shown in Figure 14.3. It has a peak value of 0.83. Under
perturbations this becomes $F_u(G, \Delta)$. To analyze the degradation of performance
due to the uncertainty, we use theorem 11.4, and an augmented block structure $\tilde{\mathbf{\Delta}}$

$$
\tilde{\mathbf{\Delta}} := \{ \mathrm{diag}\,[\Delta, \delta_2] : \Delta \in \mathbf{\Delta},\; \delta_2 \in \mathbf{C} \}
\tag{14.2}
$$

A plot of $\mu_{\tilde{\mathbf{\Delta}}}(G(j\omega))$ is shown in Figure 14.4. Applying a scaled version of theorem


Figure 14.3 Nominal Disturbance to Error Frequency Response (versus frequency, radians/second)

Figure 14.4 $\mu$ Plot for Robust Performance (versus frequency, radians/second)
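
The frequency sweeps behind plots such as Figures 14.1 and 14.3 can be sketched as follows. This is a generic illustration, not the computation used for the report: it evaluates $G(j\omega) = C(j\omega I - A)^{-1}B + D$ on a frequency grid and takes the largest singular value at each point. The one-state system $G(s) = 1/(s+1)$, the grid, and the function names are hypothetical stand-ins; the matrices G.a, G.b, G.c listed above could be substituted, with a zero $D$ term assumed.

```python
import numpy as np

def freq_response(A, B, C, D, omega):
    """Evaluate G(j*omega) = C (j*omega*I - A)^{-1} B + D."""
    n = A.shape[0]
    return C @ np.linalg.solve(1j * omega * np.eye(n) - A, B) + D

def peak_sigma(A, B, C, D, omegas):
    """Largest singular value of G(j*omega) over a frequency grid."""
    return max(np.linalg.svd(freq_response(A, B, C, D, w), compute_uv=False)[0]
               for w in omegas)

# Hypothetical first-order system G(s) = 1 / (s + 1).
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])

omegas = np.logspace(-2, 2, 400)        # radians/second
peak = peak_sigma(A, B, C, D, omegas)
unstructured_margin = 1.0 / peak        # small gain bound for unstructured perturbations
print(peak, unstructured_margin)
```

A grid can miss a sharp peak, so in practice the grid is refined near resonances before a margin is quoted.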

Finally, we consider robust stability to time varying perturbations, using the optimal
constant $D$ scaling result from section 10 to minimize the conservatism of the small gain
theorem, by taking into account the structure. This will give a sufficient condition for
robust stability to time varying, and also cone bounded, nonlinear perturbations as well.

(The correct formulas for continuous time systems are given in the appendix, and are
in the same spirit as (10.9) and (10.10)). Everything pertains to $G_{11}$, since we are only
concerned with stability. From Figure 14.2, we know that the optimal value satisfies

$$
\inf_{D \in \mathcal{D}} \| D G_{11}(s) D^{-1} \|_\infty \ge \sup_\omega \inf_{D_\omega \in \mathcal{D}} \bar{\sigma}\left( D_\omega G_{11}(j\omega) D_\omega^{-1} \right) = 0.64
\tag{14.3}
$$

We performed a 1 dimensional search to find the correct value of $\gamma$ (in equation (10.10)).
Our rather crude gradient algorithm indicates that $\gamma \in (0.68, 0.685)$. This is quite close
to the frequency varying optimal. The constant $D$ scaling we get from setting $\gamma = 0.685$
is given below.


Matrix : D.opt

BLOCK DIAGONAL

rows 8  columns 8

             1            2            3
1    1.113e+02   -7.389e+01    6.100e+00
2   -2.882e+01   -1.638e+01    5.827e+01
3    8.537e+00    7.397e+01    3.210e+01

             4            5
4    3.716e+01   -3.139e+01
5    4.586e+01    1.975e+01

             6            7            8
6    3.118e+00   -1.112e+01    1.115e?01
7    6.516e+00   -2.682e+00   -3.690e+00
8    1.623e+01    9.812e+00   -8.816e+00

If we scale $G_{11}(j\omega)$ with this constant scaling Figure 14.5 results. Note that the problem
is of the sort

$$
\inf_{D \in \mathcal{D}} \; \sup_{\text{all } \omega} \; \bar{\sigma}\left[ \mathrm{func}(D, \omega) \right].
\tag{14.4}
$$

We expect coalesced behavior at the minimum, and this is exactly what we have. In this
instance, though, the coalesced behavior is with respect to the $\omega$ variable; Figure 14.5
shows this very clearly.


Figure 14.5 Singular Value Plot with Optimal, Constant D Scales (versus frequency, radians/second)

As we noted, this example has no physical significance; it merely demonstrates several of
the different ideas we have covered in this report, namely frequency domain and state
space $\mu$ techniques, as well as the optimal constant scaling material of section 10 for time
varying perturbations. Several realistic examples using $\mu$ have appeared in the literature,
including [DoyLP] and [DoySE]. The emphasis in each of these is a particular example;
the various uses and interpretations possible with the different $\mu$ calculations are not the
main issues.


15 Appendix

15.1  Star  Products 


Recall the example from section 8.3.1 and the results on uncertain difference equations in
section 9. Both of these were done in discrete time, since in that domain, the unit disk
is important, and disks are what $\mu$ is all about. This section shows that the well known
bilinear transformation yields results analogous to the above for continuous time systems.
We begin with a generalization of the LFT, called "the star product", which is found in
[Red].


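Suppose $Q$ and $M$ are two complex matrices, which we partition as

$$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}, \qquad M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}.$$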

We are a little cavalier about the dimensions here. We only require that the matrix product
$Q_{22}M_{11}$ makes sense and is square. Obviously then, the product $M_{11}Q_{22}$ also makes sense
and is also square. If the matrix $I - Q_{22}M_{11}$ is invertible, then we say the star product
$Q*M$ is well defined, and is given by

$$Q*M := \begin{bmatrix} F_l(Q, M_{11}) & Q_{12}(I - M_{11}Q_{22})^{-1}M_{12} \\ M_{21}(I - Q_{22}M_{11})^{-1}Q_{21} & F_u(M, Q_{22}) \end{bmatrix} \tag{15.1}$$


Note that this definition is dependent on the partitioning of the matrices $Q$ and $M$ above.
In fact it may be well defined for one partition and not well defined for another. However,
we will not explicitly show this dependence, as it is always clear from the context.
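The definition (15.1) translates directly into a few lines of NumPy. The following is a minimal sketch, not an implementation from [Red]: the function name `star` and the argument `k` are hypothetical, and the partition is assumed to be the simple one in which $Q_{22}$ and $M_{11}$ are both $k \times k$. The invertibility test uses a numerical rank check, which is crude for badly conditioned data.

```python
import numpy as np

def star(Q, M, k):
    """Star product Q*M per (15.1), assuming Q22 and M11 are both k x k."""
    Q11, Q12, Q21, Q22 = Q[:-k, :-k], Q[:-k, -k:], Q[-k:, :-k], Q[-k:, -k:]
    M11, M12, M21, M22 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    I = np.eye(k)
    # Well defined only if I - Q22 M11 is invertible (numerical rank check).
    if np.linalg.matrix_rank(I - Q22 @ M11) < k:
        raise ValueError("Q*M is not well defined: I - Q22 M11 is singular")
    X = np.linalg.inv(I - Q22 @ M11)   # (I - Q22 M11)^{-1}
    Y = np.linalg.inv(I - M11 @ Q22)   # (I - M11 Q22)^{-1}
    top = np.hstack([Q11 + Q12 @ M11 @ X @ Q21,     # F_l(Q, M11)
                     Q12 @ Y @ M12])
    bot = np.hstack([M21 @ X @ Q21,
                     M22 + M21 @ Q22 @ Y @ M12])    # F_u(M, Q22)
    return np.vstack([top, bot])

# Usage: E = [[0, I], [I, 0]] passes signals straight through, so E*M = M.
k = 2
E = np.block([[np.zeros((k, k)), np.eye(k)], [np.eye(k), np.zeros((k, k))]])
M = np.arange(16.0).reshape(4, 4)
print(np.allclose(star(E, M, k), M))
```

The usage lines check one easily verified property: with $Q_{11} = Q_{22} = 0$ and $Q_{12} = Q_{21} = I$, every block of (15.1) reduces to the corresponding block of $M$.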

In a block diagram, the star product (15.1) has a natural interpretation: it is simply the
matrix relating $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ to $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$, as shown below.




Figure  15.1  Star  Product  of  Two  Matrices 


The assumption that $I - Q_{22}M_{11}$ is invertible implies that for any vectors $u_1$ and $u_2$,
there exist unique vectors $z$ and $w$ satisfying the loop equations. When working with star
products, it is much easier to manipulate the diagrams than the equations, since
the diagrams are so intuitive. However, a little care must be exercised. Consider the loop
below.


Figure  15.2  Associativity  of  Star  Products 

Should this be viewed as $(Q*M)*S$ or $Q*(M*S)$? Pictorially, it appears to make no
difference, but we have to be careful about the invertibility of the
necessary matrices. For example, suppose all the matrices are $2 \times 2$, and that $Q_{22} = 0.5$,
$M = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, and $S_{11} = 1$ (the rest of $Q$ and $S$ are irrelevant). Certainly $Q*M$ is okay,
and since $[Q*M]_{22} = 2$, the quantity $1 - [Q*M]_{22}S_{11}$ is invertible, and therefore $(Q*M)*S$
is well defined. But, since the star product $M*S$ is not even defined, we cannot compare
the first expression, $(Q*M)*S$, to $Q*(M*S)$.
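The scalar quantities in this example are easy to check numerically. The sketch below assumes that $M$ is the all-ones $2 \times 2$ matrix, which is one choice consistent with the stated value $[Q*M]_{22} = 2$; only the entries that matter ($Q_{22}$, $S_{11}$, and $M$) are used.

```python
import numpy as np

# Assumed data: Q22 = 0.5, S11 = 1, and M taken to be the all-ones matrix
# (an assumption consistent with [Q*M]_22 = 2).
Q22, S11 = 0.5, 1.0
M = np.ones((2, 2))
M11, M12, M21, M22 = M[0, 0], M[0, 1], M[1, 0], M[1, 1]

# Q*M is well defined when 1 - Q22*M11 is nonzero.
gap_QM = 1 - Q22 * M11

# [Q*M]_22 = F_u(M, Q22) = M22 + M21*Q22*(1 - M11*Q22)^{-1}*M12
QM22 = M22 + M21 * Q22 * M12 / (1 - M11 * Q22)

# (Q*M)*S is well defined when 1 - [Q*M]_22*S11 is nonzero.
gap_QM_S = 1 - QM22 * S11

# M*S is well defined when 1 - M22*S11 is nonzero; here it is zero.
gap_MS = 1 - M22 * S11

print(gap_QM, QM22, gap_QM_S, gap_MS)
```

The last quantity vanishes, which is exactly why $M*S$, and hence $Q*(M*S)$, cannot be formed even though $(Q*M)*S$ can.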

So, if we want to have associativity (which is what we need to manipulate the diagrams,
rather than working via the fairly messy definition), both $Q*M$ and $M*S$ should be well
defined. This requires that both $I - Q_{22}M_{11}$ and $I - M_{22}S_{11}$ are invertible. In this
case, the next lemma and corollary show that if either $(Q*M)*S$ or $Q*(M*S)$ is well
defined, then they are equal.

Lemma 15.1 If both $I - Q_{22}M_{11}$ and $I - M_{22}S_{11}$ are invertible, then the quantity
$I - F_u(M, Q_{22})S_{11}$ is invertible if and only if $I - F_l(M, S_{11})Q_{22}$ is invertible.

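Proof: We manipulate determinants:

$$\begin{aligned}
& \det\left[I - F_u(M, Q_{22})S_{11}\right] \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - \left[M_{22} + M_{21}Q_{22}(I - M_{11}Q_{22})^{-1}M_{12}\right]S_{11}\right\} \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - M_{22}S_{11} - M_{21}Q_{22}(I - M_{11}Q_{22})^{-1}M_{12}S_{11}\right\} \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - M_{21}Q_{22}(I - M_{11}Q_{22})^{-1}M_{12}S_{11}(I - M_{22}S_{11})^{-1}\right\} \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - M_{12}S_{11}(I - M_{22}S_{11})^{-1}M_{21}Q_{22}(I - M_{11}Q_{22})^{-1}\right\} \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - M_{11}Q_{22} - M_{12}S_{11}(I - M_{22}S_{11})^{-1}M_{21}Q_{22}\right\} \neq 0 \\
\Longleftrightarrow\ & \det\left\{I - F_l(M, S_{11})Q_{22}\right\} \neq 0 \qquad ∎
\end{aligned}$$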


This  implies  the  corollary. 


Corollary 15.2 Let $Q$, $M$, and $S$ be given. If $Q*M$ and $M*S$ are each well defined (i.e.,
$I - Q_{22}M_{11}$ and $I - M_{22}S_{11}$ are invertible), then $(Q*M)*S$ is well defined if and only if
$Q*(M*S)$ is well defined. Furthermore, if they are well defined, then they are equal.
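Corollary 15.2 can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not a proof: `star` is a hypothetical helper implementing (15.1) for the case where the shared blocks are $k \times k$, the matrices are random with small entries so that all the required inverses exist in practice, and no invertibility checks are performed.

```python
import numpy as np

def star(Q, M, k):
    """Star product (15.1) with Q22 and M11 both k x k (no invertibility check)."""
    Q11, Q12, Q21, Q22 = Q[:-k, :-k], Q[:-k, -k:], Q[-k:, :-k], Q[-k:, -k:]
    M11, M12, M21, M22 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    X = np.linalg.inv(np.eye(k) - Q22 @ M11)
    Y = np.linalg.inv(np.eye(k) - M11 @ Q22)
    return np.block([[Q11 + Q12 @ M11 @ X @ Q21, Q12 @ Y @ M12],
                     [M21 @ X @ Q21, M22 + M21 @ Q22 @ Y @ M12]])

rng = np.random.default_rng(0)
k = 2
# Hypothetical random data; the 0.3 factor keeps I - (product) well conditioned.
Q, M, S = (0.3 * rng.standard_normal((2 * k, 2 * k)) for _ in range(3))

left = star(star(Q, M, k), S, k)     # (Q*M)*S
right = star(Q, star(M, S, k), k)    # Q*(M*S)
print(np.allclose(left, right))
```

Both groupings produce the same $4 \times 4$ matrix here, as the corollary predicts whenever all the star products involved are well defined.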

These  star  products  have  many  interesting  properties  discovered  in  [Red].  We  will  not 
pursue  them  here.  In  the  next  section  though,  we  use  star  products  to  translate  the 
discrete  time  results  from  sections  8.3.1  and  9  to  analogous  continuous  time  results. 


15.2  Continuous  time  results 

In  this  section  we  show  that  the  well  known  bilinear  transformation,  along  with  the  star 
product,  yields  results  for  continuous  time  systems. 

Let $n > 0$ be an integer, and define a matrix $B$ by

$$B := \begin{bmatrix} I_n & \sqrt{2}\,I_n \\ \sqrt{2}\,I_n & I_n \end{bmatrix}.$$

Suppose $A \in \mathbf{C}^{n \times n}$. It is simple to relate the eigenvalues of $A$ and $F_l(B, A)$. In particular,

Lemma 15.3 Let $\lambda_i$, $i = 1, \ldots, n$, denote the eigenvalues of $A$. Then $\mathrm{Re}(\lambda_i) < 0$ for each
$i$ if and only if $I - A$ is invertible and $\rho\left[F_l(B, A)\right] < 1$.

Similarly,  we  have  a  matrix  version  of  the  bilinear  transformation. 

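Lemma 15.4 Suppose $A \in \mathbf{C}^{n \times n}$. Let $A_H := \tfrac{1}{2}(A + A^*)$. Then $A_H < 0$ if and only if
$I - A$ is invertible and $\bar\sigma(F_l(B, A)) < 1$.

Proof: Suppose that $I - A$ is invertible and $\bar\sigma(F_l(B, A)) < 1$. Then

$$\begin{aligned}
\bar\sigma(F_l(B, A)) < 1 \quad &\text{iff} \quad \bar\sigma\left((I + A)(I - A)^{-1}\right) < 1 \\
&\text{iff} \quad (I - A^*)^{-1}(I + A^*)(I + A)(I - A)^{-1} < I \\
&\text{iff} \quad (I + A^*)(I + A) < (I - A^*)(I - A) \\
&\text{iff} \quad A^* + A < -A^* - A \\
&\text{iff} \quad A_H < 0
\end{aligned}$$

Reversing the steps gives the proof for the other direction (note that $A_H < 0$ forces every
eigenvalue of $A$ to have negative real part, so $I - A$ is invertible). ∎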
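Lemmas 15.3 and 15.4 can be spot-checked on random matrices. In this sketch the test matrices are hypothetical (random, with a diagonal shift chosen so that some are Hurwitz and some are not), and `bilinear` evaluates $F_l(B, A) = I + \sqrt{2}\,A(I - A)^{-1}\sqrt{2} = (I + A)(I - A)^{-1}$ directly rather than forming $B$.

```python
import numpy as np

def bilinear(A):
    """F_l(B, A) for B = [[I, sqrt(2) I], [sqrt(2) I, I]], i.e. (I + A)(I - A)^{-1}."""
    I = np.eye(A.shape[0])
    return I + 2.0 * A @ np.linalg.inv(I - A)

def check(A):
    """Return the four truth values compared by Lemmas 15.3 and 15.4."""
    F = bilinear(A)
    hurwitz = bool(np.all(np.linalg.eigvals(A).real < 0))        # Re(lambda_i) < 0
    rho_lt_1 = bool(np.max(np.abs(np.linalg.eigvals(F))) < 1)    # rho(F) < 1
    AH = (A + A.conj().T) / 2
    AH_neg = bool(np.all(np.linalg.eigvalsh(AH) < 0))            # A_H < 0
    sig_lt_1 = bool(np.linalg.norm(F, 2) < 1)                    # sigma_max(F) < 1
    return hurwitz, rho_lt_1, AH_neg, sig_lt_1

rng = np.random.default_rng(1)
results = []
for shift in (-4.0, -1.5, 0.0, 1.5):
    A = rng.standard_normal((4, 4)) + shift * np.eye(4)   # hypothetical test matrix
    results.append(check(A))
    print(shift, results[-1])
```

In each printed tuple the first two entries should agree (Lemma 15.3) and the last two should agree (Lemma 15.4); the two pairs need not agree with each other, since $A_H < 0$ is a stronger condition than $A$ being Hurwitz.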


Now, let us apply this to a class of uncertain differential equations, as we did for the discrete
case in section 9. To set it up, let $M \in \mathbf{C}^{(n+m) \times (n+m)}$ be given, along with an $m \times m$ block
structure $\boldsymbol{\Delta}$, such that $\mu_{\boldsymbol{\Delta}}(M_{22}) < 1$. We are interested in solutions $x(t) \in \mathbf{C}^n$ that evolve
according to

$$\dot{x} = F_l(M, \Delta(t))\,x$$

where the function $\Delta(\cdot)$ is piecewise continuous, say. We assume that the nominal system
is known to be stable, hence all of the eigenvalues of $M_{11}$ have negative real parts. Consider
the following three assumptions on $\Delta(\cdot)$:

(a.1)  For all $t$, $\Delta(t) \in \boldsymbol{\Delta}$

(a.2)  For all $t$, $\bar\sigma(\Delta(t)) < 1$

(a.3)  $\Delta(\cdot)$ is constant; it does not vary with $t$

Now, (a.3) implies that the system is time invariant, so we just need to check that the
dynamic matrix, $F_l(M, \Delta)$, is Hurwitz for each allowable $\Delta$. Equivalently, via Lemma 15.3,
we need to check that $\rho\left[F_l(B, F_l(M, \Delta))\right] < 1$ for all allowable $\Delta$. This is displayed in
block diagram form below on the left.


Figure  15.3  Uncertain  Differential  Equations 

We would like to exchange the order, and evaluate whether $\rho\left[F_l(B*M, \Delta)\right] < 1$ for all
$\Delta$, because this is just a $\mu$ test on $B*M$. This is illustrated above right. Theorem 15.5
handles this.

Theorem 15.5 Define $\tilde{\boldsymbol{\Delta}} := \{\mathrm{diag}[\delta I_n, \Delta] : \delta \in \mathbf{C}, \Delta \in \boldsymbol{\Delta}\}$. Then, with the above
assumptions, the differential equation $\dot{x} = F_l(M, \Delta)x$ is stable for all fixed $\Delta \in \boldsymbol{\Delta}$ with
$\bar\sigma(\Delta) < 1$ if and only if $\mu_{\tilde{\boldsymbol{\Delta}}}(B*M) < 1$.

Proof: Since the nominal matrix $M_{11}$ has all of its eigenvalues in the left half plane, the
star product $B*M$ is well defined. Also by assumption, $\mu_{\boldsymbol{\Delta}}(M_{22}) < 1$, hence for every
$\Delta \in \boldsymbol{\Delta}$ with $\bar\sigma(\Delta) < 1$, the LFT $F_l(M, \Delta)$ is well defined. Hence, the standing
assumptions of lemma 15.1 (or corollary 15.2) are satisfied.

$\Leftarrow$ By hypothesis, if $\Delta \in \boldsymbol{\Delta}$ and $\bar\sigma(\Delta) < 1$, then $I - [B*M]_{22}\Delta$ is invertible, and hence
so is $I - F_l(M, \Delta)$. Therefore, by corollary 15.2, for all such $\Delta$, we have

$$F_l(B, F_l(M, \Delta)) = F_l(B*M, \Delta).$$

Therefore

$$\max_{\substack{\Delta \in \boldsymbol{\Delta} \\ \bar\sigma(\Delta) < 1}} \rho\left[F_l(B, F_l(M, \Delta))\right] = \max_{\substack{\Delta \in \boldsymbol{\Delta} \\ \bar\sigma(\Delta) < 1}} \rho\left[F_l(B*M, \Delta)\right] < 1$$

where the last inequality comes from the assumption that $\mu_{\tilde{\boldsymbol{\Delta}}}(B*M) < 1$, and the
robust performance theorem, 5.2, applied to $B*M$, with the block structure $\tilde{\boldsymbol{\Delta}}$.
Hence, using lemma 15.3 shows that the eigenvalues of $F_l(M, \Delta)$ are in the left half
plane for each $\Delta$.

$\Rightarrow$ Same type of argument. ∎
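The exchange of order used in the proof, $F_l(B, F_l(M, \Delta)) = F_l(B*M, \Delta)$, can be checked numerically. This is a sketch under stated assumptions: `lft_lower` and `star` are hypothetical helpers written for this illustration, the data `M` and `Delta` are random, and $M_{11}$ is shifted so that it is far from having an eigenvalue at 1 (so that $B*M$ is well defined).

```python
import numpy as np

def lft_lower(P, K):
    """F_l(P, K) = P11 + P12 K (I - P22 K)^{-1} P21, with P22 and K both k x k."""
    k = K.shape[0]
    P11, P12, P21, P22 = P[:-k, :-k], P[:-k, -k:], P[-k:, :-k], P[-k:, -k:]
    return P11 + P12 @ K @ np.linalg.solve(np.eye(k) - P22 @ K, P21)

def star(Q, M, k):
    """Star product (15.1) with Q22 and M11 both k x k (no invertibility check)."""
    Q11, Q12, Q21, Q22 = Q[:-k, :-k], Q[:-k, -k:], Q[-k:, :-k], Q[-k:, -k:]
    M11, M12, M21, M22 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    X = np.linalg.inv(np.eye(k) - Q22 @ M11)
    Y = np.linalg.inv(np.eye(k) - M11 @ Q22)
    return np.block([[Q11 + Q12 @ M11 @ X @ Q21, Q12 @ Y @ M12],
                     [M21 @ X @ Q21, M22 + M21 @ Q22 @ Y @ M12]])

n, m = 3, 2
rng = np.random.default_rng(2)
M = 0.4 * rng.standard_normal((n + m, n + m))   # hypothetical data
M[:n, :n] -= 2.0 * np.eye(n)                    # push M11 toward the left half plane
Delta = 0.5 * rng.standard_normal((m, m))       # one fixed perturbation

s2 = np.sqrt(2.0)
B = np.block([[np.eye(n), s2 * np.eye(n)], [s2 * np.eye(n), np.eye(n)]])

lhs = lft_lower(B, lft_lower(M, Delta))   # F_l(B, F_l(M, Delta))
rhs = lft_lower(star(B, M, n), Delta)     # F_l(B*M, Delta)
print("orders agree:", np.allclose(lhs, rhs))
print("spectral radius:", np.max(np.abs(np.linalg.eigvals(rhs))))
```

A single random $\Delta$ only illustrates the identity; deciding stability for all allowable $\Delta$ is the $\mu$ test on $B*M$ described in the theorem and is not attempted here.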

Similar  results  are  obtained  for  the  other  situations.  We  collect  them  here. 

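Lemma 15.6 There is a single Lyapunov matrix for the entire set of “A” matrices

$$\{F_l(M, \Delta) : \Delta \in \boldsymbol{\Delta},\ \bar\sigma(\Delta) < 1\}$$

if and only if

$$\inf_{T \text{ invertible}} \mu_{\hat{\boldsymbol{\Delta}}}\left(\begin{bmatrix} T & 0 \\ 0 & I \end{bmatrix}(B*M)\begin{bmatrix} T^{-1} & 0 \\ 0 & I \end{bmatrix}\right) < 1$$

where $\hat{\boldsymbol{\Delta}} := \{\mathrm{diag}[\Delta_1, \Delta] : \Delta_1 \in \mathbf{C}^{n \times n}, \Delta \in \boldsymbol{\Delta}\}$.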

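Lemma 15.7 Let $\hat{\mathcal{D}} := \{\mathrm{diag}[D_1, D] : D_1 \in \mathbf{C}^{n \times n} \text{ invertible}, D \in \mathcal{D}\}$, where $\mathcal{D}$ is the
appropriate scaling set for the block structure $\boldsymbol{\Delta}$. Then a sufficient condition for Lemma
15.6 is

$$\inf_{\hat{D} \in \hat{\mathcal{D}}} \bar\sigma\left(\hat{D}\,(B*M)\,\hat{D}^{-1}\right) < 1$$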

We  can  also  use  B  and  the  star  product  to  switch  between  z  and  s  domains  for  transform 
results. 


Lemma 15.8 Let $A, B, C, D$ be a state space realization of a stable, continuous time
transfer function $G(s)$, with $m$ inputs and $m$ outputs. We assume that the matrix $A$ is
Hurwitz. Define the matrix $M$ as

$$M := \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

Let $\boldsymbol{\Delta} := \{\mathrm{diag}[\delta_1 I_n, \Delta_2] : \delta_1 \in \mathbf{C}, \Delta_2 \in \mathbf{C}^{m \times m}\}$. Then

$$\|G\|_\infty < 1 \quad \text{iff} \quad \mu_{\boldsymbol{\Delta}}(B*M) < 1$$
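The switch between the $z$ and $s$ domains can be made concrete. Writing out (15.1) with $Q$ equal to the bilinear matrix and $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ gives the four blocks used below, and closing the upper loop of $B*M$ with $\delta I_n$ then evaluates $G$ at $s = (1 - \delta)/(1 + \delta)$, which maps $|\delta| < 1$ into $\mathrm{Re}(s) > 0$. The sketch checks this identity at one point; the realization is random and hypothetical, and the input matrix is named `Bm` to avoid confusion with the bilinear matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2
# Hypothetical realization; the shift makes A very likely Hurwitz.
A = rng.standard_normal((n, n)) - 3.0 * np.eye(n)
Bm = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m))
I = np.eye(n)

# Blocks of N = (bilinear matrix) * M, written out from (15.1).
Ainv = np.linalg.inv(I - A)
N11 = (I + A) @ Ainv               # F_l(B, A)
N12 = np.sqrt(2.0) * Ainv @ Bm
N21 = np.sqrt(2.0) * C @ Ainv
N22 = D + C @ Ainv @ Bm            # equals G(1)

def G(s):
    """Continuous time transfer function D + C (sI - A)^{-1} Bm."""
    return D + C @ np.linalg.solve(s * I - A, Bm)

def upper_lft(delta):
    """F_u(N, delta I_n) = N22 + N21 delta (I - N11 delta)^{-1} N12."""
    return N22 + delta * N21 @ np.linalg.solve(I - delta * N11, N12)

delta = 0.3 + 0.4j                 # |delta| = 0.5 < 1
s = (1 - delta) / (1 + delta)      # corresponding point with Re(s) > 0
print(s.real > 0, np.allclose(upper_lft(delta), G(s)))
```

This is only the change of variable behind the lemma; the equivalence with $\|G\|_\infty < 1$ additionally involves the full block $\Delta_2$ and is not checked here.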


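Lemma 15.9 Let $G(s) := D + C(sI - A)^{-1}B$ be an $m$ input, $m$ output, rational, stable
transfer function. Suppose $\boldsymbol{\Delta}_2$ is an $m \times m$ block structure as in (3.1), and let $\mathcal{D}_2$ be the
corresponding scaling set. For $\alpha > 0$, define

$$M^\alpha := \begin{bmatrix} A & B \\ \alpha C & \alpha D \end{bmatrix} \tag{15.2}$$

Define $\gamma \in \mathbf{R}$ by

$$\gamma := \sup_{\alpha > 0}\left\{\alpha : \inf_{\substack{D_1 \text{ invertible} \\ D_2 \in \mathcal{D}_2}} \bar\sigma\left(\begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}(B*M^\alpha)\begin{bmatrix} D_1^{-1} & 0 \\ 0 & D_2^{-1} \end{bmatrix}\right) < 1\right\} \tag{15.3}$$

Then

$$\inf_{D_2 \in \mathcal{D}_2}\ \sup_{\substack{s \in \mathbf{C} \\ \mathrm{Re}(s) > 0}} \bar\sigma\left(D_2 G(s) D_2^{-1}\right) = \frac{1}{\gamma} \tag{15.4}$$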


15.3  Convexity  Lemma 


The following lemma gives a sufficient condition for a continuous function from $\mathbf{R} \to \mathbf{R}$ to
be convex. It is fairly intuitive, and comes from [ChuD].

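Lemma 5.2 Let $f : \mathbf{R} \to \mathbf{R}$ be a continuous function, and suppose for each $t_0 \in \mathbf{R}$
there exists a function $g_{t_0} \in C^2$ (continuously twice differentiable), $g_{t_0} : \mathbf{R} \to \mathbf{R}$, such that
$f(t_0) = g_{t_0}(t_0)$, $f(t) \geq g_{t_0}(t)$ for all $t \in \mathbf{R}$, and $\left.\dfrac{d^2 g_{t_0}}{dt^2}\right|_{t = t_0} > 0$. Then $f$ is a convex function.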
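A simple family of functions that satisfies the hypotheses is the pointwise maximum of finitely many quadratics with positive leading coefficients: at each $t_0$ the quadratic that attains the maximum serves as $g_{t_0}$. The sketch below checks the hypotheses and the conclusion on a grid. The coefficients are hypothetical, and a grid check is an illustration only, not a proof.

```python
import numpy as np

# Hypothetical data: rows are (a_i, b_i, c_i) with a_i > 0,
# defining q_i(t) = a_i t^2 + b_i t + c_i.
coeffs = np.array([[1.0, -2.0, 0.0],
                   [0.5, 1.0, -1.0],
                   [2.0, 0.0, -3.0]])
a, b, c = coeffs[:, 0:1], coeffs[:, 1:2], coeffs[:, 2:3]

ts = np.linspace(-4.0, 4.0, 401)
q = a * ts**2 + b * ts + c       # q[i, j] = q_i(ts[j])
f = q.max(axis=0)                # f(t) = max_i q_i(t)
active = q.argmax(axis=0)        # index of the supporting quadratic g_{t0}

# Hypotheses of the lemma on the grid, with g_{t0} = the active quadratic.
touches = bool(np.allclose(f, q[active, np.arange(ts.size)]))   # f(t0) = g_{t0}(t0)
below = bool(np.all(q <= f + 1e-12))                            # g_{t0}(t) <= f(t) for all t
curved = bool(np.all(2.0 * a > 0))                              # g'' = 2 a_i > 0

# Conclusion: midpoint convexity of f at every interior grid point.
convex = bool(np.all(f[1:-1] <= (f[:-2] + f[2:]) / 2 + 1e-12))
print(touches, below, curved, convex)
```

The midpoint test works because the grid is uniform, so each interior point is the midpoint of its two neighbours.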

Proof: Suppose $f$ is not convex. Then there exist $x, y \in \mathbf{R}$, $x < y$, and $\lambda \in (0, 1)$ such
that

$$f((1 - \lambda)x + \lambda y) > (1 - \lambda)f(x) + \lambda f(y)$$

Let $\theta$ be the largest difference this assumes, i.e.,

$$\theta = \max_{\alpha \in [0, 1]}\left[f((1 - \alpha)x + \alpha y) - (1 - \alpha)f(x) - \alpha f(y)\right]$$

and let $\lambda$ be the largest value in $[0, 1]$ that achieves $\theta$. Obviously, since $\theta > 0$,
$\lambda \in (0, 1)$. Define $w := (1 - \lambda)x + \lambda y$. Hence $f$ is continuous, satisfies $f(w) = \theta$,
and lies in the shaded region as shown below (shaded region includes its boundary
for $t \leq w$, and does not include its boundary for $t > w$).


Figure  15.4  Diagram  for  Lemma  5.2 

Now, let $g$ be any function in $C^2$ with $g(w) = \theta$ and $\left.\dfrac{d^2 g}{dt^2}\right|_{t = w} > 0$. Obviously, there
are points $\tilde{w}$ arbitrarily close to $w$ such that $f(\tilde{w}) < g(\tilde{w})$. So, by contrapositive, we
have proven the lemma. ∎